Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Winter-17
Academic Session: 2018–2019
Subject: DBMS
MCA 1st Year (Sem II)
Q1
EITHER
(a) What is metadata? Differentiate between a conventional file processing system and a database system.
Ans:
Metadata is data that describes other data. "Meta" is a prefix that, in most information technology usages, means an underlying definition or description.
Metadata summarizes basic information about data, which can make finding and working with particular instances of data easier. For example, author, date created, date modified, and file size are examples of very basic document metadata. Having the ability to filter through that metadata makes it much easier for someone to locate a specific document.
In addition to document files, metadata is used for images, videos, spreadsheets, and web pages. The use of metadata on web pages can be very important. Metadata for web pages contains descriptions of the page's contents as well as keywords linked to the content. These are usually expressed in the form of metatags. The metadata containing the web page's description and summary is often displayed in search results by search engines, making its accuracy and detail very important, since it can determine whether a user decides to visit the site or not. Metatags are often evaluated by search engines to help decide a web page's relevance, and were used as the key factor in determining position in a search until the late 1990s. The rise of search engine optimization (SEO) towards the end of the 1990s led to many websites "keyword stuffing" their metadata to trick search engines into making their websites seem more relevant than others. Since then, search engines have reduced their reliance on metatags, though they are still factored in when indexing pages. Many search engines also try to halt web pages' attempts to thwart their systems by regularly changing their ranking criteria, with Google being notorious for frequently changing its closely guarded ranking algorithms.
Metadata can be created manually or by automated information processing. Manual creation tends to be more accurate, allowing the user to input any information they feel is relevant or needed to help describe the file. Automated metadata creation can be much more elementary, usually only capturing information such as file size, file extension, when the file was created, and who created the file.
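Automated metadata of this elementary kind can be read programmatically. As a small illustrative sketch (the file name is invented for the example), Python's standard library exposes a file's size, extension, and timestamps:

```python
import os
import time

def file_metadata(path):
    """Return basic automated metadata for a file: size, extension, timestamp."""
    st = os.stat(path)
    return {
        "size_bytes": st.st_size,                # file size
        "extension": os.path.splitext(path)[1],  # file extension
        "modified": time.ctime(st.st_mtime),     # last-modified time
    }

# Example with a file we create ourselves:
with open("example.txt", "w") as f:
    f.write("hello")

meta = file_metadata("example.txt")
print(meta["size_bytes"], meta["extension"])  # 5 .txt
```

Note that this captures only the "elementary" metadata mentioned above; richer descriptive metadata (author, subject, keywords) would have to be entered manually.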
The differences between the file processing system and the database approach are as follows:
1. File-based system: the data and programs are inter-dependent. Database system: the data and programs are independent of each other.
2. File-based system: causes data redundancy; the same data may be duplicated in different files. Database system: controls data redundancy; each item of data appears only once in the system.
3. File-based system: causes data inconsistency; the data in different files may differ. Database system: data is always consistent, because it appears only once.
4. File-based system: data cannot easily be shared, because it is distributed across different files. Database system: data is easily shared, because it is stored in one place.
5. File-based system: data is widely spread, so the system provides poor security. Database system: provides many methods to maintain data security.
6. File-based system: does not provide consistency constraints. Database system: provides various consistency constraints to maintain data integrity.
7. File-based system: a less complex system. Database system: a very complex system.
8. File-based system: the cost is less than that of a database system. Database system: the cost is much more than that of a file processing system.
9. File-based system: takes more space, and memory is wasted. Database system: stores data more efficiently, takes less space, and memory is not wasted.
10. File-based system: generating the different reports needed to take a crucial decision is very difficult. Database system: reports can be generated very easily in the required format, because the data is stored in an organized manner and is easily retrieved.
11. File-based system: does not provide a concurrency facility. Database system: provides a concurrency facility.
12. File-based system: does not provide data atomicity. Database system: provides data atomicity.
13. File-based system: difficult to maintain, as it provides less controlling facility. Database system: provides many facilities to maintain programs.
14. File-based system: if one application fails, it does not affect the other files in the system. Database system: if the database fails, it affects all applications that depend on it.
15. File-based system: hardware cost is less. Database system: hardware cost is higher.
(b) What is a Database Management System? Explain the components of a Database Management System.
Ans:
Organizations employ Database Management Systems (DBMS) to help them effectively manage their data and derive relevant information from it. A DBMS is a technology tool that directly supports data management. It is a package designed to define, manipulate, and manage data in a database.
Some general functions of a DBMS:
o Allow the definition, creation, querying, update, and administration of databases.
o Define rules to validate the data, relieving users of framing programs for data maintenance.
o Convert an existing database, or archive a large and growing one.
o Run business applications which perform the tasks of managing business processes, interacting with end users and other applications to capture and analyze data.
Some well-known DBMSs are Microsoft SQL Server, Microsoft Access, Oracle, SAP, and others.
Components of DBMS
A DBMS has several components, each performing a significant task in the database management system environment. Below is a list of components within the database and its environment.
Software
This is the set of programs used to control and manage the overall database. It includes the DBMS software itself, the operating system, the network software used to share the data among users, and the application programs used to access data in the DBMS.
Hardware
Consists of a set of physical electronic devices such as computers, I/O devices, and storage devices. This provides the interface between the computers and the real-world systems.
Data
The DBMS exists to collect, store, process, and provide access to data, the most important component. The database contains both the actual (operational) data and the metadata.
Procedures
These are the instructions and rules that describe how to use the DBMS and how to design and run the database, using documented procedures to guide the users who operate and manage it.
Database Access Language
This is used to access the data to and from the database: to enter new data, update existing data, or retrieve required data. The user writes a set of appropriate commands in a database access language and submits these to the DBMS, which then processes the data and generates and displays a set of results in a user-readable form.
Query Processor
This transforms user queries into a series of low-level instructions. It reads the online user's query and translates it into an efficient series of operations in a form capable of being sent to the run-time data manager for execution.
Run Time Database Manager
Sometimes referred to as the database control system, this is the central software component of the DBMS that interfaces with user-submitted application programs and queries and handles database access at run time. Its function is to convert the operations in users' queries. It provides control to maintain the consistency, integrity, and security of the data.
Data Manager
Also called the cache manager, this is responsible for handling data in the database and providing a recovery mechanism that allows the system to recover the data after a failure.
Database Engine
The core service for storing, processing, and securing data. It provides controlled access and rapid transaction processing to address the requirements of the most demanding data-consuming applications. It is often used to create relational databases for online transaction processing or online analytical processing.
Data Dictionary
This is a reserved space within a database used to store information about the database itself. A data dictionary is a set of read-only tables and views containing information about the data used in the enterprise, ensuring that the database representation of the data follows one standard as defined in the dictionary.
Report Writer
Also referred to as the report generator, this is a program that extracts information from one or more files and presents the information in a specified format. Most report writers allow the user to select records that meet certain conditions, display selected fields in rows and columns, and format the data into different charts.
OR
(c) Explain the three-level architecture proposal for DBMS.
Ans:
In the previous tutorial we have seen the DBMS architectures: one-tier, two-tier, and three-tier. Here we discuss the three-level DBMS architecture in detail.
[Diagram: DBMS three-level architecture]
This architecture has three levels:
1. External level
2. Conceptual level
3. Internal level
1. External level
It is also called the view level. This level is called "view" because several users can view their desired data at this level; the data is internally fetched from the database with the help of the conceptual-level and internal-level mappings.
The user does not need to know database schema details such as the data structures, table definitions, etc. The user is concerned only with the data, which is returned to the view level after it has been fetched from the database (present at the internal level).
The external level is the "top level" of the three-level DBMS architecture.
2. Conceptual level
It is also called the logical level. The whole design of the database, such as the relationships among data, the schema of the data, etc., is described at this level.
Database constraints and security are also implemented at this level of the architecture. This level is maintained by the DBA (database administrator).
3. Internal level
This level is also known as the physical level. It describes how the data is actually stored on the storage devices and is also responsible for allocating space to the data. This is the lowest level of the architecture.
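As a small sketch of the idea (table and column names are invented for illustration), the external level can be approximated in SQL by a view that exposes only part of the conceptual schema, while the storage engine handles the internal level:

```python
import sqlite3

# Conceptual level: the full logical schema (a base table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("INSERT INTO staff VALUES ('S1', 'Ann', 12000), ('S2', 'Bob', 9000)")

# External level: a view for one user group that hides the salary column.
conn.execute("CREATE VIEW staff_names AS SELECT staff_no, name FROM staff")

rows = conn.execute("SELECT name FROM staff_names ORDER BY name").fetchall()
print([r[0] for r in rows])  # ['Ann', 'Bob']
```

The internal level is invisible here: SQLite decides how the rows are physically laid out on disk, which is exactly the separation the three-level architecture describes.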
(d) Explain:
(i) Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the characteristic of being able to modify the schema at one level of the database system without altering the schema at the next higher level.
There are two types of data independence:
1. Logical Data Independence
o Logical data independence refers to the characteristic of being able to change the conceptual schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual view.
o If we make changes in the conceptual view of the data, the user's view of the data is not affected.
o Logical data independence occurs at the user interface level.
2. Physical Data Independence
o Physical data independence can be defined as the capacity to change the internal schema without having to change the conceptual schema.
o If we make changes in the storage size of the database system server, the conceptual structure of the database is not affected.
o Physical data independence is used to separate the conceptual level from the internal level.
o Physical data independence occurs at the logical interface level.
(ii) Data Integration
Ans:
Data integration involves combining data residing in different sources and providing users with a unified view of them [1]. This process becomes significant in a variety of situations, both commercial (such as when two similar companies need to merge their databases) and scientific (combining research results from different bioinformatics repositories, for example). Data integration appears with increasing frequency as the volume of data (that is, big data [2]) and the need to share existing data explode [3]. It has become the focus of extensive theoretical work, and numerous open problems remain unsolved. Data integration encourages collaboration between internal as well as external users.
Figure 1: Simple schematic for a data warehouse. The extract, transform, load (ETL) process extracts information from the source databases, transforms it, and then loads it into the data warehouse.
Figure 2: Simple schematic for a data-integration solution. A system designer constructs a mediated schema against which users can run queries. The virtual database interfaces with the source databases via wrapper code if required.
The issues with combining heterogeneous data sources, often referred to as information silos, under a single query interface have existed for some time. In the early 1980s, computer scientists began designing systems for the interoperability of heterogeneous databases [4]. The first data integration system driven by structured metadata was designed at the University of Minnesota in 1991 for the Integrated Public Use Microdata Series (IPUMS). IPUMS used a data warehousing approach, which extracts, transforms, and loads data from heterogeneous sources into a single view schema so that data from different sources become compatible [5]. By making thousands of population databases interoperable, IPUMS demonstrated the feasibility of large-scale data integration. The data warehouse approach offers a tightly coupled architecture because the data are already physically reconciled in a single queryable repository, so it usually takes little time to resolve queries [6].
The data warehouse approach is less feasible for data sets that are frequently updated, requiring the extract, transform, load (ETL) process to be continuously re-executed for synchronization. Difficulties also arise in constructing data warehouses when one has only a query interface to summary data sources and no access to the full data. This problem frequently emerges when integrating several commercial query services like travel or classified advertisement web applications.
As of 2009, the trend in data integration favored loosening the coupling between data [citation needed] and providing a unified query interface to access real-time data over a mediated schema (see Figure 2), which allows information to be retrieved directly from the original databases. This is consistent with the SOA approach popular in that era. This approach relies on mappings between the mediated schema and the schemas of the original sources, transforming a query into specialized queries that match the schemas of the original databases. Such mappings can be specified in two ways: as a mapping from entities in the mediated schema to entities in the original sources (the Global-As-View (GAV) approach), or as a mapping from entities in the original sources to the mediated schema (the Local-As-View (LAV) approach). The latter approach requires more sophisticated inferences to resolve a query on the mediated schema, but makes it easier to add new data sources to a (stable) mediated schema.
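A minimal sketch of the mediated-schema idea (all source records, field names, and wrapper functions here are invented for illustration): wrapper code translates each source's records into one common schema, and queries run against the unified view rather than the original databases:

```python
# Two heterogeneous "source databases" with different schemas.
source_a = [{"fname": "Ada", "pay_usd": 50000}]
source_b = [{"employee": "Ben", "salary": 60000}]

# Wrapper code: map each source into the mediated schema (name, salary).
def wrap_a(rec):
    return {"name": rec["fname"], "salary": rec["pay_usd"]}

def wrap_b(rec):
    return {"name": rec["employee"], "salary": rec["salary"]}

def mediated_query(min_salary):
    """Run one query over the unified view of both sources."""
    unified = [wrap_a(r) for r in source_a] + [wrap_b(r) for r in source_b]
    return [r["name"] for r in unified if r["salary"] >= min_salary]

print(mediated_query(55000))  # ['Ben']
```

In GAV terms, the wrappers define the mediated schema's entities as views over the sources; adding a new source means writing one more wrapper.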
As of 2010, some of the work in data integration research concerns the semantic integration problem. This problem addresses not the structuring of the architecture of the integration, but how to resolve semantic conflicts between heterogeneous data sources. For example, if two companies merge their databases, certain concepts and definitions in their respective schemas, like "earnings", inevitably have different meanings. In one database it may mean profits in dollars (a floating-point number), while in the other it might represent the number of sales (an integer). A common strategy for the resolution of such problems involves the use of ontologies, which explicitly define schema terms and thus help to resolve semantic conflicts. This approach represents ontology-based data integration. On the other hand, the problem of combining research results from different bioinformatics repositories requires benchmarking the similarities computed from different data sources on a single criterion, such as positive predictive value. This makes the data sources directly comparable, and they can be integrated even when the natures of the experiments are distinct [7].
As of 2011, it was determined that current data modeling methods were imparting data isolation into every data architecture in the form of islands of disparate data and information silos. This data isolation is an unintended artifact of the data modeling methodology, which results in the development of disparate data models. Disparate data models, when instantiated as databases, form disparate databases. Enhanced data model methodologies have been developed to eliminate the data isolation artifact and to promote the development of integrated data models [8]. One enhanced data modeling method recasts data models by augmenting them with structural metadata in the form of standardized data entities. As a result of recasting multiple data models, the set of recast data models will share one or more commonality relationships that relate the structural metadata now common to these data models. Commonality relationships are a peer-to-peer type of entity relationship that relates the standardized data entities of multiple data models. Multiple data models that contain the same standard data entity may participate in the same commonality relationship. When integrated data models are instantiated as databases and are properly populated from a common set of master data, these databases are integrated.
Since 2011, data hub approaches have been of greater interest than fully structured (typically relational) enterprise data warehouses. Since 2013, data lake approaches have risen to the level of data hubs. (See the popularity of all three search terms on Google Trends [9].) These approaches combine unstructured or varied data into one location, but do not necessarily require an (often complex) master relational schema to structure and define all data in the hub.
Q2
EITHER
(a) Explain the E-R model with a suitable example.
Ans: It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. Building the model is an iterative, team-oriented process; all business managers (or designates) involved should validate it with a "bottom-up" approach. It has three primary components: entities, relationships, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships, and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order. An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name, address, etc.
Attributes are of various types:
o Simple/Single attributes
o Composite attributes
o Multivalued attributes
o Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an "owns" relationship between a company and a computer, a "supervises" relationship between an employee and a department, a "performs" relationship between an artist and a song, a "proved" relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many: 1 ------ M
Many to one: M ------ 1
Many to many: M ------ M
Symbols and their meanings:
o Rectangles represent entity sets.
o Diamonds represent relationship sets.
o Lines link attributes to entity sets and entity sets to relationship sets.
o Ellipses represent attributes.
o Double ellipses represent multivalued attributes.
o Dashed ellipses denote derived attributes.
o Underlines indicate primary key attributes.
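As an illustrative sketch (the entity and attribute names are invented), the "owns" relationship mentioned above can be realized as relational tables, with the one-to-many relationship carried by a foreign key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Entity sets become tables; attributes become columns.
conn.execute("CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT)")
# The one-to-many 'owns' relationship: each computer references its owning company.
conn.execute("""CREATE TABLE computer (
    computer_id INTEGER PRIMARY KEY,
    model TEXT,
    company_id INTEGER REFERENCES company(company_id))""")

conn.execute("INSERT INTO company VALUES (1, 'Acme')")
conn.execute("INSERT INTO computer VALUES (10, 'laptop', 1), (11, 'server', 1)")

# One company side, many computer side: 1 ------ M.
n = conn.execute("SELECT COUNT(*) FROM computer WHERE company_id = 1").fetchone()[0]
print(n)  # 2
```

A many-to-many relationship would instead need a third table holding pairs of keys, which is why the diamond in the diagram sometimes becomes its own table.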
Example
(b) Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
An entity relationship diagram displays the relationships of the entity sets stored in a database. In other words, ER diagrams help you to explain the logical structure of databases. At first look an ER diagram appears very similar to a flowchart; however, the ER diagram includes many specialized symbols whose meanings make this model unique.
Sample ER Diagram
Facts about the ER Diagram Model:
o The ER model allows you to draw a database design.
o It is an easy-to-use graphical tool for modeling data.
o It is widely used in database design.
o It is a GUI representation of the logical structure of a database.
o It helps you to identify the entities which exist in a system and the relationships between those entities.
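A hedged sketch of how the Customer entity given above might be realized as a relational table: flattening the composite attributes (name, address, street) into simple columns is one common design choice, not the only one (the sample data is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite attributes 'name', 'address', and 'street' flattened into columns.
conn.execute("""CREATE TABLE customer (
    customer_id      INTEGER PRIMARY KEY,  -- primary key attribute
    first_name       TEXT,
    middle_name      TEXT,
    last_name        TEXT,
    phone_number     TEXT,
    date_of_birth    TEXT,
    city             TEXT,
    state            TEXT,
    zip_code         TEXT,
    street_name      TEXT,
    street_number    TEXT,
    apartment_number TEXT)""")

conn.execute("""INSERT INTO customer (customer_id, first_name, last_name, city)
                VALUES (1, 'John', 'Smith', 'Nagpur')""")
row = conn.execute("SELECT first_name, city FROM customer").fetchone()
print(row)  # ('John', 'Smith') would be wrong; this prints ('John', 'Nagpur')
```

If phone_number were multivalued, it would instead move to its own table keyed by customer_id, mirroring the double-ellipse notation in the diagram.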
(b) Differentiate between the Network and Hierarchical data models in DBMS.
Ans: Hierarchical model
1. One-to-many or one-to-one relationships.
2. Based on a parent-child relationship.
3. Retrieval algorithms are complex and asymmetric.
4. More data redundancy.
Network model
1. Many-to-many relationships.
2. Many parents as well as many children.
3. Retrieval algorithms are complex and symmetric.
4. Less data redundancy than the hierarchical model.
Relational model
1. One-to-one, one-to-many, and many-to-many relationships.
2. Based on relational data structures.
3. Retrieval algorithms are simple and symmetric.
4. Less data redundancy.
OR
(c) Draw an E-R diagram for a Library Management System.
Ans: [E-R diagram not reproduced in this transcript]
(d) State the advantages and disadvantages of the following file organizations:
(i) Index-Sequential file
Ans:
Sequential File Organization
1. A sequential file is designed for efficient processing of records in sorted order on some search key.
o Records are chained together by pointers to permit fast retrieval in search-key order.
o Each pointer points to the next record in order.
o Records are stored physically in search-key order (or as close to this as possible).
o This minimizes the number of block accesses.
o Figure 10.15 shows an example with bname as the search key.
2. It is difficult to maintain physical sequential order as records are inserted and deleted.
o Deletion can be managed with the pointer chains.
o Insertion poses problems if there is no space where the new record should go.
o If there is space, use it; else put the new record in an overflow block.
o Adjust the pointers accordingly.
o Figure 10.16 shows the previous example after an insertion.
o Problem: we now have some records out of physical sequential order.
o If very few records are in overflow blocks, this will work well.
o If order is lost, reorganize the file.
o Reorganizations are expensive and are done when the system load is low.
3. If insertions rarely occur, we could keep the file in physically sorted order and reorganize when an insertion occurs. In this case, the pointer fields are no longer required.
The Sequential File
A fixed format is used for records:
o Records are the same length.
o All fields are the same (order and length).
o Field names and lengths are attributes of the file.
o One field is the key field; it uniquely identifies the record.
o Records are stored in key sequence.
New records are placed in a log file or transaction file, and a batch update is performed to merge the log file with the master file.
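A minimal sketch of the overflow idea described above (the record layout is invented for illustration): the main area stays physically sorted, inserts that do not fit go to an overflow area, and a logical scan merges the two until an expensive reorganization restores physical order:

```python
# Main area: records kept in physical search-key order.
main_area = [(10, "rec-A"), (20, "rec-B"), (40, "rec-D")]
# Overflow area: out-of-order inserts land here until reorganization.
overflow = []

def insert(record):
    """No free slot in the main area, so append to the overflow block."""
    overflow.append(record)

def scan_in_key_order():
    """Logical sequential scan: merge the main area with the overflow records."""
    return sorted(main_area + overflow)

def reorganize():
    """Expensive: rewrite the whole file in physical key order (done at low load)."""
    global main_area, overflow
    main_area = scan_in_key_order()
    overflow = []

insert((30, "rec-C"))
print([k for k, _ in scan_in_key_order()])  # [10, 20, 30, 40]
reorganize()
print(len(overflow))  # 0
```

The trade-off in the notes is visible here: scans stay correct while overflow is small, but every scan pays a sorting cost until the file is reorganized.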
(ii) Direct file
Direct Access File System (DAFS) is a network file system similar to Network File System
(NFS) and Common Internet File System (CIFS) that allows applications to transfer data while
bypassing operating system control buffering and network protocol operations that can
bottleneck throughput DAFS uses the Virtual Interface (VI) architecture as its underlying
transport mechanism Using VI hardware an application transfers data to and from application
buffers without using the operating system which frees up the processor and operating system
for other processes and allows files to be accessed by servers using several different operating
systems DAFS is designed and optimized for clustered shared-file network environments that
are commonly used for Internet e-commerce and database applications DAFS is optimized for
high-bandwidth InfiniBand networks and it works with any interconnection that supports VI
including Fibre Channel and Ethernet
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and
promote DAFS Today more than 85 companies are part of the DAFS
Collaborative
Q3
EITHER
(a) Explain tuple relational calculus.
Ans:
Relational Calculus
A relational calculus query specifies what is to be retrieved rather than how to retrieve it; there is no description of how to evaluate the query.
In first-order logic (or predicate calculus), a predicate is a truth-valued function with arguments. When we substitute values for the arguments, the function yields an expression, called a proposition, which can be either true or false.
If a predicate contains a variable (e.g. 'x is a member of staff'), there must be a range for x. When we substitute some values of this range for x, the proposition may be true; for other values, it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
Tuple Relational Calculus
We are interested in finding tuples for which a predicate is true, based on the use of tuple variables. A tuple variable is a variable that 'ranges over' a named relation, i.e. a variable whose only permitted values are tuples of the relation.
We specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Example
To find the details of all staff earning more than $10,000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Quantifiers
We can use two quantifiers to tell how many instances the predicate applies to:
o Existential quantifier ∃ ('there exists')
o Universal quantifier ∀ ('for all')
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called free variables.
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧ (B.branchNo = S.branchNo) ∧ B.city = 'London')
This means: 'There exists a Branch tuple with the same branchNo as the branchNo of the current Staff tuple S, and it is located in London.'
The universal quantifier is used in statements about every instance, such as:
(∀B)(B.city ≠ 'Paris')
This means: 'For all Branch tuples, the address is not in Paris.'
We can also write ~(∃B)(B.city = 'Paris'), which means: 'There are no branches with an address in Paris.'
Well-Formed Formulae
Formulae should be unambiguous and make sense. A (well-formed) formula is made out of atoms:
o R(Si), where Si is a tuple variable and R is a relation
o Si.a1 θ Sj.a2
o Si.a1 θ c
We can recursively build up formulae from atoms:
o An atom is a formula.
o If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction F1 ∨ F2, and negation ~F1.
o If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also formulae.
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000:
{S.fName, S.lName | Staff(S) ∧ S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow:
{S | Staff(S) ∧ (∃P)(PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
To avoid this, we add the restriction that all values in the result must be values in the domain of the expression.
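As a hedged sketch (the sample Staff tuples are invented), tuple relational calculus expressions map naturally onto Python comprehensions: the relation is the range of the tuple variable, and the predicate is the filter:

```python
# The Staff relation; dictionaries play the role of tuples.
staff = [
    {"staffNo": "SL21", "fName": "John",  "position": "Manager",    "salary": 30000},
    {"staffNo": "SG37", "fName": "Ann",   "position": "Assistant",  "salary": 12000},
    {"staffNo": "SG14", "fName": "David", "position": "Supervisor", "salary": 18000},
]

# {S | Staff(S) ∧ S.salary > 10000}
well_paid = [S for S in staff if S["salary"] > 10000]

# {S.fName | Staff(S) ∧ S.position = 'Manager' ∧ S.salary > 25000}
manager_names = [S["fName"] for S in staff
                 if S["position"] == "Manager" and S["salary"] > 25000]

print(len(well_paid), manager_names)  # 3 ['John']
```

The safety restriction above is also visible here: iterating only over staff means the query can never produce the infinite set {S | ~Staff(S)}.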
Data Manipulation in SQL
o SELECT, UPDATE, DELETE, and INSERT statements
o Basic data retrieval
o Condition specification
o Arithmetic and aggregate operators
o SQL joins: multiple-table queries
o Set manipulation: ANY, IN, CONTAINS, ALL, NOT IN, NOT CONTAINS, EXISTS, UNION, MINUS, INTERSECT
o Categorization
o Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement; data must be entered later using INSERT.
CREATE TABLE S (
    SNO    CHAR(5),
    SNAME  CHAR(20),
    STATUS DECIMAL(3),
    CITY   CHAR(15),
    PRIMARY KEY (SNO) );
A table name and unique column names must be specified. Columns which are defined as primary keys will never have two rows with the same key value. A primary key may consist of more than one column (values unique in combination); this is called a composite key.
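Run against SQLite as a small sketch (SQLite accepts this DDL, though it treats CHAR/DECIMAL types loosely), the table above can be created and its key constraint exercised:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE S (
    SNO    CHAR(5),
    SNAME  CHAR(20),
    STATUS DECIMAL(3),
    CITY   CHAR(15),
    PRIMARY KEY (SNO) )""")

conn.execute("INSERT INTO S VALUES ('S1', 'Smith', 20, 'London')")

# A second row with the same key value violates the primary key.
try:
    conn.execute("INSERT INTO S VALUES ('S1', 'Jones', 10, 'Paris')")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

print(duplicate_allowed)  # False
```

The rejected insert demonstrates the rule stated above: no two rows may share the same primary key value.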
(b) Explain data manipulation in SQL.
Ans:
A data manipulation language (DML) is a computer programming language used for adding (inserting), deleting, and modifying (updating) data in a database. A DML is often a sublanguage of a broader database language such as SQL, with the DML comprising some of the operators in the language [1]. Read-only selecting of data is sometimes distinguished as being part of a separate data query language (DQL), but it is closely related and sometimes also considered a component of a DML; some operators may perform both selecting (reading) and writing.
A popular data manipulation language is that of Structured Query Language (SQL), which is used to retrieve and manipulate data in a relational database [2]. Other forms of DML are those used by IMS/DL/I, and by CODASYL databases such as IDMS, among others.
In SQL, the data manipulation language comprises the SQL-data change statements [3], which modify stored data but not the schema or database objects. Manipulation of persistent database objects (e.g. tables or stored procedures) via the SQL schema statements [3], rather than the data stored within them, is considered to be part of a separate data definition language (DDL). In SQL these two categories are similar in their detailed syntax, data types, expressions, etc., but distinct in their overall function [3].
The SQL-data change statements are a subset of the SQL-data statements; this set also contains the SELECT query statement [3], which, strictly speaking, is part of the DQL, not the DML. In common practice, though, this distinction is not made, and SELECT is widely considered to be part of the DML [4], so the DML consists of all SQL-data statements, not only the SQL-data change statements. The SELECT ... INTO form combines both selection and manipulation and thus is strictly considered to be DML, because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a statement, which is almost always a verb. In the case of SQL, these verbs are:
o SELECT ... FROM ... WHERE ... (strictly speaking, DQL)
o SELECT ... INTO ...
o INSERT INTO ... VALUES ...
o UPDATE ... SET ... WHERE ...
o DELETE FROM ... WHERE ...
For example, the command to insert a row into the table employees:
INSERT INTO employees (first_name, last_name, fname)
VALUES ('John', 'Capita', 'xcapit00');
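As a small sketch of those verbs in action (the table and data are invented; run on SQLite, which supports this subset):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT INTO ... VALUES ...
conn.execute("INSERT INTO employees (first_name, last_name, fname) "
             "VALUES ('John', 'Capita', 'xcapit00')")
conn.execute("INSERT INTO employees VALUES ('Mary', 'Major', 'mmajor01')")

# UPDATE ... SET ... WHERE ...
conn.execute("UPDATE employees SET last_name = 'Kapita' WHERE fname = 'xcapit00'")

# DELETE FROM ... WHERE ...
conn.execute("DELETE FROM employees WHERE fname = 'mmajor01'")

# SELECT ... FROM ... WHERE ... (strictly speaking, DQL)
rows = conn.execute("SELECT first_name, last_name FROM employees").fetchall()
print(rows)  # [('John', 'Kapita')]
```

Note that only the INSERT, UPDATE, and DELETE statements here are SQL-data change statements; the final SELECT reads data without modifying it.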
OR
(c) Explain the following integrity rules
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce
these rules automatically, but it is safer to make sure that the rules are
already applied in the design. There are two types of integrity mentioned in
integrity rules: entity and reference. Two additional rules that aren't
necessarily included in integrity rules but are pertinent to database designs
are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that
is unique; this ensures that each row is uniquely identified by the primary
key. One requirement for entity integrity is that a primary key cannot have a
null value. The purpose of this integrity is for each row to have a unique
identity, so that foreign key values can properly reference primary key values.
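A minimal sketch of entity integrity enforcement, using Python's sqlite3 and a hypothetical department table (the explicit NOT NULL is needed because, as a historical quirk, SQLite otherwise permits NULL in non-INTEGER primary key columns):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE department (dept_id TEXT NOT NULL PRIMARY KEY, name TEXT)")
cur.execute("INSERT INTO department VALUES ('D1', 'Sales')")

# A duplicate primary key value violates entity integrity (uniqueness).
try:
    cur.execute("INSERT INTO department VALUES ('D1', 'HR')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A NULL primary key violates entity integrity (no null key values).
try:
    cur.execute("INSERT INTO department VALUES (NULL, 'IT')")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

print(duplicate_rejected, null_rejected)  # True True
```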
Theta Join
In a theta join we apply a condition to the input relation(s), and only the selected
rows are used in the cross product to be merged and included in the output. In a
normal cross product, all the rows of one relation are mapped/merged with all the
rows of the second relation, but here only the selected rows of one relation enter
the cross product with the second relation. It is denoted R ⋈θ S.
If R and S are two relations, then θ is the condition applied during the select
operation on one relation, and only the selected rows are cross-produced with all the
rows of the second relation. For example, given two relations FACULTY and
COURSE, we first apply the select operation on the FACULTY relation to
select certain specific rows; these rows then form a cross product with the
COURSE relation. This is the difference between a cross product and a theta join.
Seeing both relations, their attributes, and the cross product carried out after the
select operation makes the difference between cross product and theta join clear.
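The difference can be sketched in plain Python with hypothetical FACULTY and COURSE data (not the relations shown in the original figure): the theta join keeps only the cross-product pairs satisfying the condition θ.

```python
# Hypothetical relations: FACULTY(fac_id, name, salary), COURSE(course_id, title, fac_id)
faculty = [("F1", "Ahmed", 3000), ("F2", "Sara", 5000)]
course = [("C1", "DBMS", "F1"), ("C2", "OS", "F2"), ("C3", "Networks", "F1")]

# Cross product: every FACULTY row paired with every COURSE row.
cross = [(f, c) for f in faculty for c in course]

# Theta join with condition faculty.fac_id = course.fac_id
# (an equijoin, which is a special case of theta join).
theta = [(f, c) for f, c in cross if f[0] == c[2]]

print(len(cross), len(theta))  # 6 3
```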
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having
the foreign key (in the associated table) reference a primary key value (in the primary, or
parent, table). Because of this, we need to ensure that data on both sides of the relationship
remain intact.
So, referential integrity requires that whenever a foreign key value is used, it must reference a
valid, existing primary key in the parent table.
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no
foreign key in any related table with the value of 15. We should only be able to delete a primary
key if there are no associated records; otherwise we would end up with an orphaned record.
Here, the related table contains a foreign key value that doesn't exist in the primary key field of
the primary table (i.e., the "CompanyId" field). This has resulted in an "orphaned record".
So, referential integrity will prevent users from:
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
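A sketch of these rules being enforced, using Python's sqlite3 and hypothetical company/product tables (note that SQLite checks foreign keys only when the pragma is enabled):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    company_id INTEGER REFERENCES company(company_id))""")
conn.execute("INSERT INTO company VALUES (15, 'Acme')")
conn.execute("INSERT INTO product VALUES (1, 15)")  # valid: parent row 15 exists

# Adding a related record with no associated parent record is rejected.
try:
    conn.execute("INSERT INTO product VALUES (2, 99)")
    orphan_insert_ok = True
except sqlite3.IntegrityError:
    orphan_insert_ok = False

# Deleting a parent record that has matching related records is rejected.
try:
    conn.execute("DELETE FROM company WHERE company_id = 15")
    parent_delete_ok = True
except sqlite3.IntegrityError:
    parent_delete_ok = False

print(orphan_insert_ok, parent_delete_ok)  # False False
```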
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually
with no indication of an error. This could result in records being "lost" in the database because
they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated
company).
Or, worse yet, it could result in customers not receiving products they paid for.
Worse still, it could affect life-and-death situations, such as a hospital patient not receiving the
correct treatment, or a disaster relief team not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity, which is concerned with the accuracy and
consistency of all data (relationship or otherwise). Maintaining data integrity is a crucial part of
working with databases.
(d) Explain the following domain in detail with example
Ans: Definition: The domain of a database attribute is the set of all allowable values that
attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column.
In data management and database analysis, a data domain refers to all the unique values which
a data element may contain. The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values.[1]
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type and
allowed to have one of two known code values: 'M' for male and 'F' for female, with NULL for
records where gender is unknown or not applicable (or, arguably, 'U' for unknown as a sentinel
value). The data domain for the gender column is {'M', 'F'}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value, excluding NULL. Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check
constraint or, in more complex cases, in a database trigger. For example, a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero.
This definition combines the concept of domain as an area over which control is exercised with
the mathematical idea of a set of values of an independent variable for which a function is
defined.
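Both kinds of domain rule can be sketched as check constraints, using Python's sqlite3 and a hypothetical person table (note that a NULL value passes a CHECK, matching the "unknown" case described above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE person (
    name   TEXT,
    gender TEXT CHECK (gender IN ('M', 'F')),   -- enumerated domain
    salary REAL CHECK (salary > 0))             -- values must be positive
""")
conn.execute("INSERT INTO person VALUES ('John', 'M', 1000)")
conn.execute("INSERT INTO person VALUES ('Pat', NULL, 500)")  # NULL passes a CHECK

# A value outside the enumerated domain is rejected.
try:
    conn.execute("INSERT INTO person VALUES ('X', 'Q', 100)")
    out_of_domain_ok = True
except sqlite3.IntegrityError:
    out_of_domain_ok = False

rows = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(out_of_domain_ok, rows)  # False 2
```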
(ii) Degree and cardinality
The degree of relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) to the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
The last is written M:N, not M:M, because the two "many" sides are in general different numbers.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-
to-one relationship rarely exists in practice, but it can; when it does, you may consider combining
the two entities into one.
For example, an employee is allocated a company car, which can only be driven by that
employee.
Therefore, there is a one-to-one relationship between employee and company car.
One-to-many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example,
taking the employee and department entities, an employee works in
one department, but a department has many employees.
Therefore, there is a one-to-many relationship between department and employee.
Many-to-many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would resolve any such relationships, but the
definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely survive in a final design; normally they occur
because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a
team of many employees.
Therefore, there is a many-to-many relationship between employee and project.
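In a relational schema, an M:N relationship is implemented by adding a junction table holding a pair of foreign keys. A sketch with hypothetical employee/project data, using Python's sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);
-- junction (associative) table resolving the M:N relationship into two 1:M links
CREATE TABLE works_on (
    emp_id  INTEGER REFERENCES employee(emp_id),
    proj_id INTEGER REFERENCES project(proj_id),
    PRIMARY KEY (emp_id, proj_id));
INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Website');
INSERT INTO works_on VALUES (1, 10), (1, 20), (2, 10);
""")

# One employee, many projects: the projects Ann works on.
ann = [r[0] for r in conn.execute(
    """SELECT p.title FROM project p
       JOIN works_on w ON w.proj_id = p.proj_id
       WHERE w.emp_id = 1 ORDER BY p.proj_id""")]
print(ann)  # ['Payroll', 'Website']
```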
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL), the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct, and self-contained language. The DBTG proposal is intended to meet the
requirements of many distinct programming languages, not just COBOL; the user in a
DBTG system is considered to be an ordinary application programmer, and the language
therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the
Data Definition Language or DDL), the DBTG proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of the conceptual scheme that
was itself defined using the Data Definition Language. It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view.
Architecture of the DBTG Model
The architecture of a DBTG system is illustrated in the figure.
The architecture of the DBTG model can be divided into three different levels, like the
architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data items they
contain, and the sets into which they are grouped. (Here, logical record types are referred
to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data item, and set are excluded.
In the DBTG model the users are application programmers writing in an ordinary
programming language, such as COBOL, that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data
item) defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization: the process of decomposing unsatisfactory "bad" relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties.
Normalization in industry pays particular attention to
normalization up to 3NF, BCNF, or 4NF.
We will pay particular attention up to 3NF.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table,
transform data from the information source (e.g., a form) into table format with columns
and rows.
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
If a table of data meets the definition of a relation, it is in first normal form:
Every relation has a unique name.
Every attribute value is atomic (single-valued).
Every row is unique.
Attributes in tables have unique names.
The order of the columns is irrelevant.
The order of the rows is irrelevant.
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove each repeating group by
entering appropriate data into the empty columns of rows containing the repeating
data ('flattening' the table),
or by
placing the repeating data, along with a copy of the original key attribute(s), into a
separate relation.
Second Normal Form (2NF)
Based on the concept of full functional dependency:
if A and B are attributes of a relation,
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A.
2NF: a relation that is in 1NF and in which every non-primary-key attribute is fully
functionally dependent on the primary key.
Equivalently, 2NF means 1NF and no partial functional dependencies.
A partial functional dependency arises when one or more non-key attributes are functionally
dependent on part of the primary key.
Every non-key attribute must be defined by the entire key, not just by part of the key.
If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies on the primary key exist, remove them by placing them in a new
relation along with a copy of their determinant.
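The 1NF-to-2NF step can be sketched in plain Python, using a hypothetical order_line relation with composite key (order_id, product_id), where customer_name depends only on order_id (a partial dependency):

```python
# 1NF relation: (order_id, product_id, qty, customer_name)
# customer_name depends only on order_id, part of the composite key.
order_line = [
    (1, 'P1', 2, 'Ann'),
    (1, 'P2', 1, 'Ann'),
    (2, 'P1', 5, 'Bob'),
]

# 2NF decomposition: move the partially dependent attribute into a new
# relation keyed by its determinant, keeping a copy of the determinant.
orders = sorted({(oid, cust) for oid, _, _, cust in order_line})  # (order_id, customer_name)
lines = [(oid, pid, qty) for oid, pid, qty, _ in order_line]      # (order_id, product_id, qty)

# The decomposition is lossless: joining the two relations on order_id
# reconstructs the original relation.
cust_of = dict(orders)
rejoined = [(oid, pid, qty, cust_of[oid]) for oid, pid, qty in lines]
print(rejoined == order_line)  # True
```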
Third Normal Form (3NF)
2NF and no transitive dependencies.
A transitive dependency is a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency:
if A, B, and C are attributes of a relation such that A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C).
3NF: a relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with suitable example.
Ans: As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is in
4NF if and only if it is in BCNF and its multivalued dependencies are functional
dependencies. 4NF removes the unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
There is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
uses multivalued dependencies.
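A multivalued dependency and its 4NF decomposition can be sketched in plain Python, using a hypothetical course/teacher/book relation where course ↠ teacher and course ↠ book hold independently:

```python
# Every teacher of a course is paired with every textbook of that course,
# so the relation holds the MVDs course ->> teacher and course ->> book.
ctb = {("DB", t, b) for t in ("Ali", "Sara") for b in ("Elmasri", "Date")}

# 4NF decomposition: project onto (course, teacher) and (course, book).
ct = {(c, t) for c, t, _ in ctb}
cb = {(c, b) for c, _, b in ctb}

# The natural join of the two projections reproduces the original
# relation, so the decomposition is lossless.
rejoined = {(c, t, b) for c, t in ct for c2, b in cb if c == c2}
print(rejoined == ctb)  # True
```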
(d) What are inference axioms? Explain their significance in Relational
Database Design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy
certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: if Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: if X → Y and X → Z, then X → YZ
F4 Projectivity: if X → YZ, then X → Y
F5 Transitivity: if X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: if X → Y and YZ → W, then XZ → W
Examples of the use of inference axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity of (2) and (4)
[From Maier]
2. Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
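Derivations like the one above can be checked mechanically with the standard attribute-closure algorithm: X → Y is derivable from F iff Y is contained in the closure of X under F. A sketch in Python, using Maier's FD set from the example:

```python
def closure(attrs, fds):
    """Compute the closure of a set of attributes under a list of FDs.

    Repeatedly apply every FD whose left-hand side is already contained
    in the closure, until no more attributes can be added.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# F = {AB -> E, AG -> J, BE -> I, E -> G, GI -> H}
F = [({'A', 'B'}, {'E'}), ({'A', 'G'}, {'J'}), ({'B', 'E'}, {'I'}),
     ({'E'}, {'G'}), ({'G', 'I'}, {'H'})]

cl = closure({'A', 'B'}, F)
print({'G', 'H'} <= cl)  # True: AB -> GH follows from F
```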
Significance in Relational Database design: The inference axioms let a designer compute the
closure of a set of functional dependencies, and hence determine candidate keys and test whether
a decomposition is lossless or dependency-preserving; they are therefore the formal basis for
normalization. A relational database is a database structure, commonly used in GIS, in
which data is stored in two-dimensional tables and multiple relationships between data
elements can be defined and established in an ad-hoc manner. A Relational Database Management
System is a database system made up of files with data elements in a two-dimensional array (rows
and columns). This database management system has the capability to recombine data elements
to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables that:
• are manipulated a set at a time, rather than a record at a time;
• are manipulated using SQL. The relational model was proposed by Dr. Codd in 1970
and is the basis for the relational database management system (RDBMS).
The relational model contains the following components:
• a collection of objects or relations;
• a set of operations to act on the relations.
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be dealt with in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order to prevent such instances. Essentially,
once a deadlock does occur, the DBMS must have a method for detecting the deadlock;
then, to resolve it, the DBMS must select a transaction to cancel and revert the entire
transaction until the resources required become available, allowing one transaction to
complete while the other has to be reprocessed at a later time.
Explain the meaning of the expression "ACID transaction".
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it
should be atomic: it should either be complete or fully incomplete; there should not
be anything like a semi-complete transaction. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, the transactions
should be scheduled in such a fashion that they remain in isolation from one another. Durability
means that once a transaction commits, its effects will persist even if there are system failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the process of being
changed. Their purpose is to ensure consistency throughout the database. For example, if I
am changing a row which affects the calculations or outputs of several other rows, then
all rows that are affected, or possibly affected, by a change in the row I'm working on will
be locked from changes until I am complete with my change. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let's
say I'm processing a change to the tax rate in my state; my store clerk shouldn't be able
to read the total cost of a blue shirt, because the total cost row is affected by any changes in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but hasn't been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
(b) Explain concurrency control and database recovery in detail.
Ans: In a multiprogramming environment where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation, and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
Binary locks: a lock on a data item can be in two states; it is either locked or
unlocked.
Shared/exclusive locks: this type of locking mechanism differentiates the locks based on
their use. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock, since allowing more than one transaction to write to the same data item
would lead the database into an inconsistent state. Read locks are shared, because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks. Before initiating an execution, the transaction requests the system for all the locks it
needs beforehand. If all the locks are granted, the transaction executes and releases all the locks
when all its operations are over. If all the locks are not granted, the transaction rolls back and
waits until all the locks are granted.
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first
part, when the transaction starts executing, it seeks permission for the locks it requires. The
second part is where the transaction acquires all the locks. As soon as the transaction releases its
first lock, the third phase starts. In this phase, the transaction cannot demand any new locks; it
only releases the acquired locks.
Two-phase locking has two phases: a growing phase, where all the locks are being acquired by
the transaction, and a shrinking phase, where the locks held by the transaction are
being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock.
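The two-phase rule itself can be sketched in a few lines of Python. This is a toy illustration of the rule for a single transaction, not a full concurrent lock manager:

```python
class TwoPhaseTxn:
    """Enforces the 2PL rule: once the first lock is released (shrinking
    phase), the transaction may not acquire any new locks."""

    def __init__(self):
        self.locks = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violated: lock requested after first unlock")
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True  # the growing phase ends at the first release
        self.locks.discard(item)

t = TwoPhaseTxn()
t.lock("A")
t.lock("B")      # growing phase: locks may still be acquired
t.unlock("A")    # first release: shrinking phase begins
try:
    t.lock("C")  # not allowed under 2PL
    violated = False
except RuntimeError:
    violated = True
print(violated)  # True
```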
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the
transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a
lock after using it; Strict-2PL holds all the locks until the commit point and releases them all
at once.
Strict-2PL does not have cascading aborts, as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between conflicting pairs among transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 0002 would be older than all other
transactions that come after it. For example, any transaction y entering the system at 0004 is
two seconds younger, and priority is given to the older one.
In addition, every data item is given the latest read-timestamp and write-timestamp. This lets the system
know when the last read and write operations were performed on the data item.
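A sketch of the basic timestamp-ordering write rule in Python (simplified: a write is rejected, forcing a rollback, when a younger transaction has already read or written the item):

```python
class Item:
    """A data item carrying the latest read and write timestamps."""

    def __init__(self):
        self.read_ts = 0   # timestamp of the youngest transaction to read it
        self.write_ts = 0  # timestamp of the youngest transaction to write it

def write(item, ts):
    """Basic timestamp-ordering check for a write by a transaction with
    timestamp ts: reject if a younger transaction already read or wrote."""
    if ts < item.read_ts or ts < item.write_ts:
        return "rollback"  # the older transaction must restart
    item.write_ts = ts
    return "ok"

x = Item()
r1 = write(x, 5)  # transaction with ts=5 writes x
r2 = write(x, 3)  # older transaction (ts=3) conflicts with the younger write
print(r1, r2)  # ok rollback
```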
OR
(b) Explain database security mechanisms. (8)
Database security covers and enforces security on all aspects and components of databases. This
includes:
Data stored in the database
The database server
The database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented, and maintained by a database administrator
and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash under a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment against theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d) Explain knowledge based database system in detail.
Ans:
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database. At the time (the 1970s), virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database. At this point in the history of Information Technology, the distinction
between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
Flat data: data was usually represented in a tabular format with strings or numbers in each
field.
Multiple users: a conventional database needed to support more than one user or system
logged into the same data at the same time.
Transactions: an essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-
called ACID properties: Atomicity, Consistency, Isolation, and Durability.
Large, long-lived data: a corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes, subclasses, and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze, and reuse
knowledge. For example, see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database. The knowledge-base needed to know facts about the world, for example to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store information about thousands of tables that
represented information about specific humans. Representing that all humans are mortal, and
being able to reason about any given human that they are mortal, is the work of a knowledge-
base. Representing that George, Mary, Sam, Jenna, Mike, and hundreds of thousands of other
customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments,
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple, distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and Object-Oriented
communities, object-oriented databases such as Versant emerged. These were systems designed
from the ground up to have support for object-oriented capabilities, but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet. With the rise of the Internet,
documents, hypertext, and multimedia support were now critical for any corporate database. It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents. This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes. Knowledge Management actually predated the
Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term knowledge-base to describe their repositories, but the
meaning had a subtle difference. In the case of previous knowledge-based systems, the
knowledge was primarily for the use of an automated system, to reason about and draw
conclusions about the world. With knowledge management products, the knowledge was
primarily meant for humans, for example to serve as a repository of manuals, procedures,
policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 – 2019
Subject: DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data, since there is no way they can interfere with one another. Any practical database, though, has a mix of READ and WRITE operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur with a multi-
user system. It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
Ans: In database systems, atomicity (from Ancient Greek ἄτομος, átomos, "undividable") is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all of them occur or nothing occurs. A guarantee of atomicity prevents updates to the database from
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails.
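The bank-transfer example can be sketched with Python's built-in sqlite3 module (the table and account names here are invented for illustration): if a failure occurs between the withdrawal and the deposit, the whole transaction rolls back and neither operation takes effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(frm, to, amount, fail_midway=False):
    # 'with conn' runs both updates in one transaction: it commits on
    # success and rolls back everything if any exception is raised.
    with conn:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                     (amount, frm))
        if fail_midway:
            raise RuntimeError("simulated crash between withdraw and deposit")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, to))

try:
    transfer("A", "B", 30, fail_midway=True)   # crashes mid-transfer
except RuntimeError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'A': 100, 'B': 50} - the partial withdrawal was undone
```

Because the withdrawal was rolled back, no money was lost: the database is exactly as it was before the failed transfer.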
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for a DBMS:
- All users should be able to access the same data.
- A user's view is immune to changes made in other views.
- Users should not need to know physical database storage details.
- The DBA should be able to change database storage structures without affecting the users' views.
- The internal structure of the database should be unaffected by changes to physical aspects of storage.
- The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a) External level
b) Conceptual level
c) Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view differs from the way data is stored in the database; it describes only a part of
the actual database. Because a user is not concerned with the entire database, only the part that
is relevant to that user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
being used; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
The objectives of the three-level architecture proposal for a DBMS are thus suitably explained
above.
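As an illustrative sketch (table, view and column names are invented), a SQL view is one concrete way an external view can be defined over a conceptual-level table; the sketch uses Python's built-in sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: the full base table, as the DBA defines it.
conn.execute("CREATE TABLE employee ("
             "emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)",
                 [(1, "Asha", "HR", 40000), (2, "Ravi", "IT", 55000)])
# External level: a view exposing only what this class of user may see;
# the salary column is hidden from the view.
conn.execute("CREATE VIEW emp_public AS SELECT emp_id, name, dept FROM employee")
rows = conn.execute("SELECT * FROM emp_public ORDER BY emp_id").fetchall()
print(rows)  # [(1, 'Asha', 'HR'), (2, 'Ravi', 'IT')]
```

If the DBA later adds columns to employee, emp_public is unaffected - a small demonstration of logical data independence.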
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig.: Structure of a Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified
in the DDL. It includes metadata information such as the names of the files and data items, storage
details of each file, mapping information, and constraints.
2. DML Compiler and Query Optimizer - DML commands such as insert, update, delete and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized by the query optimizer into the best way to execute the query, and
then sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
- Converting operations in user queries, coming from the application programs or from the combination of
DML compiler and query optimizer (together known as the query processor), from the user's logical view
to the physical file system.
- Controlling access to DBMS information stored on disk.
- Handling buffers in main memory.
- Enforcing constraints to maintain the consistency and integrity of the data.
- Synchronizing the simultaneous operations performed by concurrent users.
- Controlling backup and recovery operations.
4. Data Dictionary - The data dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data - the names of the tables, the names of the attributes of each table, the lengths of attributes, and the number of rows in each table.
2. Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e., the range of values permitted.
4. Detailed information on physical database design, such as storage structures, access paths, and file and record sizes.
5. Access authorization - a description of database users, their responsibilities and their access rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to actually control data integrity, database operation and accuracy, and may be regarded as an important part of the DBMS.
Importance of the Data Dictionary -
The data dictionary is necessary in databases for the following reasons:
• It improves the control of the DBA over the information system and the users' understanding of the use of the system.
• It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve Users: Naive users need not be aware of the presence of the database system or any other supporting system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect only a precise portion of the database - in the case of the ATM user, only one or more of his or her own accounts. Other naive users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online Users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise through the limited interaction they are permitted with the database via the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs may be written in a general-purpose programming language, such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications and data sharing: the database
becomes a common resource for an agency. It implies separation of physical storage from the use of the
data by an application program, i.e., program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored, since such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g., changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one device to
another (e.g., from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files.
• Wastage of storage space, since duplicated data is stored.
• Errors generated due to updating of the same data in different files.
• Time wasted in entering the same data again and again.
• Needless use of computer resources.
• Great difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data, so we need to remove this duplication of
data across multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In
a conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
currency are likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to anticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, the use of a DBMS
should allow users who do not know programming to interact with the data more easily, unlike a
file processing system, where a programmer may need to write new programs to meet every
new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the
data in the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization access different
components of the operational data, and in such an environment enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database, and different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8. Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work the most
important and therefore its needs the most important. Once a database has been
set up with centralized control, it becomes necessary to identify the organization's requirements and
to balance the needs of the competing units. It may then become necessary to ignore some
requests for information if they conflict with higher-priority needs of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for a similar service using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed for DBMSs than using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand; the overall view is often not considered. Building an overall view of the
organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes for
backup and recovery from failures, including disk crashes, power failures and software errors,
which help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods involved are quite complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The ER model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. Building it is an
iterative, team-oriented process in which all business managers (or their designates) should be
involved, and the result should be validated with a "bottom-up" approach. The model has three primary components: entities,
relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity type is a category; an entity, strictly speaking, is an instance of a given entity type. There are
usually many instances of an entity type. Because the term "entity type" is somewhat cumbersome, most
people tend to use the term "entity" as a synonym.
Attributes: An attribute is a characteristic of an entity. A Student entity's attributes include student ID, student name,
address, etc.
Attributes are of various types:
- Simple/single attributes
- Composite attributes
- Multivalued attributes
- Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an "owns" relationship between a company and a computer, a "supervises" relationship between an employee and a department, a "performs" relationship
between an artist and a song, a "proved" relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One-to-many: 1 ------- M; Many-to-one: M ------- 1
Many-to-many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
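A hypothetical Python sketch of this Customer entity (attribute names are taken from the example above; the class layout itself is an assumption): composite attributes become nested classes, the multivalued attribute becomes a list, and a derived attribute becomes a computed property.

```python
from dataclasses import dataclass, field

@dataclass
class Name:                       # composite attribute
    first_name: str
    last_name: str
    middle_name: str = ""

@dataclass
class Address:                    # composite attribute containing Street
    city: str
    state: str
    zip_code: str
    street_name: str
    street_number: str = ""
    apartment_number: str = ""

@dataclass
class Customer:                   # entity type; customer_id plays the key role
    customer_id: int
    name: Name
    address: Address
    date_of_birth: str = ""
    phone_numbers: list = field(default_factory=list)  # multivalued attribute

    @property
    def full_name(self) -> str:   # derived attribute: computed, not stored
        return f"{self.name.first_name} {self.name.last_name}"

c = Customer(1, Name("Asha", "Patil"),
             Address("Nagpur", "MH", "440001", "Main Road"))
c.phone_numbers.append("98765xxxxx")
print(c.full_name)  # Asha Patil
```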
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: With sequential files, index-sequential files and direct files, we have considered the retrieval and
update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
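A minimal sketch of a secondary index, assuming an in-memory student file keyed by a hypothetical stud_id; note that one secondary key value ("Asha") matches several records, unlike a primary key lookup:

```python
from collections import defaultdict

# Primary file: records keyed by the primary key stud_id.
students = {
    101: {"stud_id": 101, "stud_name": "Asha", "dept": "MCA"},
    102: {"stud_id": 102, "stud_name": "Ravi", "dept": "MCA"},
    103: {"stud_id": 103, "stud_name": "Asha", "dept": "MBA"},
}

# Secondary index on stud_name: maps each name to the SET of matching
# primary keys, because a secondary key need not be unique.
name_index = defaultdict(set)
for pk, rec in students.items():
    name_index[rec["stud_name"]].add(pk)

# Secondary-key retrieval: one key value, possibly many records.
matches = [students[pk] for pk in sorted(name_index["Asha"])]
print([m["stud_id"] for m in matches])  # [101, 103]
```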
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation

QUE 3 - EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give an equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is a join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot
be non-loss decomposed into any number of smaller tables.
Another way of expressing this is that every join dependency in the table is a consequence of its candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
composed of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be non-loss decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
- you always need to know two values (pairwise);
- for any one value, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor, to determine the vendor you must know the buyer and
the item, and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
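The decomposition can be checked mechanically. The sketch below (illustrative, using the sample data above) joins the three binary projections and confirms both the lossless join and, after Sally buys Claiborne jeans, the extra row the projections force for Mary - the pairwise cycle at work:

```python
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach",       "Jeans"),
    ("Mary",  "Jordach",       "Jeans"),
    ("Sally", "Jordach",       "Sneakers"),
}

def join3(bv, bi, vi):
    # Natural join of the three binary projections.
    return {(b, v, i) for (b, v) in bv for (b2, i) in bi
            if b2 == b if (v, i) in vi}

bv = {(b, v) for (b, v, i) in buying}   # Buyer-Vendor projection
bi = {(b, i) for (b, v, i) in buying}   # Buyer-Item projection
vi = {(v, i) for (b, v, i) in buying}   # Vendor-Item projection
print(join3(bv, bi, vi) == buying)      # True: the 3-way join is lossless

# Record only Sally's Claiborne-jeans purchase; the projections now also
# imply the row (Mary, Liz Claiborne, Jeans).
bigger = buying | {("Sally", "Liz Claiborne", "Jeans")}
bv2 = {(b, v) for (b, v, i) in bigger}
bi2 = {(b, i) for (b, v, i) in bigger}
vi2 = {(v, i) for (b, v, i) in bigger}
print(("Mary", "Liz Claiborne", "Jeans") in join3(bv2, bi2, vi2))  # True
```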
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
Fig.: IMS system architecture - application programs (host language + DL/I calls) access the IMS control program through PCBs grouped into PSBs (one PSB per application, e.g. PSB-A for Application A and PSB-B for Application B); the control program maps requests onto the physical databases defined by the DBDs.
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined, together with its mapping to storage, by a database description (DBD). The set of
DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping
definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program.
All DBD names in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers using a host language, from which the IMS data
manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End users are
supported via user-written online application programs; IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Ans: Functional Dependency - The value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key - A possible key:
- each non-key field is functionally dependent on every candidate key;
- no attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of the functional dependencies used in
normalization: they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of
a dependency, they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation
and has the property that every functional dependency in Y is implied by
the functional dependencies in X.
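A small, illustrative checker for whether a functional dependency holds in a given extension of a relation (the relation and attribute names here are invented): the dependency fails as soon as one determinant value maps to two different dependent values.

```python
def holds(rows, lhs, rhs):
    """Return True iff the functional dependency lhs -> rhs holds in rows,
    where rows are dicts and lhs/rhs are tuples of attribute names."""
    seen = {}
    for r in rows:
        key = tuple(r[a] for a in lhs)
        val = tuple(r[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # same determinant, different dependent value
    return True

emp = [
    {"emp_id": 1, "dept": "IT", "dept_city": "Pune"},
    {"emp_id": 2, "dept": "IT", "dept_city": "Pune"},
    {"emp_id": 3, "dept": "HR", "dept_city": "Nagpur"},
]
print(holds(emp, ("emp_id",), ("dept",)))  # True: emp_id determines dept
print(holds(emp, ("dept",), ("emp_id",)))  # False: "IT" maps to both 1 and 2
```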
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal-form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF;
we will pay particular attention up to 3NF.
The database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies
between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form with
known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies.
- NF2: non-first normal form.
- 1NF: R is in 1NF iff all domain values are atomic.
- 2NF: R is in 2NF iff R is in 1NF and every non-key attribute is fully dependent on the key.
- 3NF: R is in 3NF iff R is in 2NF and every non-key attribute is non-transitively dependent on the key.
- BCNF: R is in BCNF iff every determinant is a candidate key.
- Determinant: an attribute on which some other attribute is fully functionally dependent.

Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all its multivalued dependencies are in fact functional dependencies. 4NF
thus removes an unwanted data structure: multivalued dependencies.
Either of the following conditions must hold for a relation to be in fourth normal form:
- there is no multivalued dependency in the relation; or
- there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it takes
multivalued dependencies into account.
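A multivalued dependency can be tested directly from its definition: X ->> Y holds in R(X, Y, Z) iff R equals the join of its projections on (X, Y) and (X, Z). An illustrative sketch with invented data, where instructors and textbooks for a course vary independently:

```python
def mvd_holds(rows, a, b, c):
    """Return True iff the multivalued dependency a ->> b holds in a
    relation with attributes (a, b, c): it holds iff the relation equals
    the join of its projections on (a, b) and (a, c)."""
    ab = {(r[a], r[b]) for r in rows}
    ac = {(r[a], r[c]) for r in rows}
    joined = {(x, y, z) for (x, y) in ab for (x2, z) in ac if x2 == x}
    return joined == {(r[a], r[b], r[c]) for r in rows}

# course ->> instructor: every instructor pairs with every text.
rows = [
    {"course": "DBMS", "instructor": "Rao",   "text": "Date"},
    {"course": "DBMS", "instructor": "Rao",   "text": "Navathe"},
    {"course": "DBMS", "instructor": "Mehta", "text": "Date"},
    {"course": "DBMS", "instructor": "Mehta", "text": "Navathe"},
]
print(mvd_holds(rows, "course", "instructor", "text"))       # True

# Drop one combination: instructor and text are no longer independent.
print(mvd_holds(rows[:-1], "course", "instructor", "text"))  # False
```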
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases also offer some kind of query language, allowing objects to be found by a more declarative programming approach. It is in the area of object query languages, and the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is the way in which the schema of a database is defined. A
general characteristic, however, is that the programming language and the database schema use the same
type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the
set of all its versions, and object versions can be treated as objects in their own right. Some object
databases also provide systematic support for triggers and constraints, which are the basis of active
databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could get a user's account information and
efficiently provide extensive information such as transactions and account entries.
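The "pointer-following instead of joins" point can be sketched as follows (a toy illustration, not any particular OODBMS API): the account holds direct references to its transaction objects, so fetching them requires no join on a foreign key.

```python
class Transaction:
    def __init__(self, amount, kind):
        self.amount, self.kind = amount, kind

class Account:
    def __init__(self, owner):
        self.owner = owner
        self.transactions = []   # direct object references, not foreign keys

acct = Account("Sally")
acct.transactions.append(Transaction(-40, "withdrawal"))
acct.transactions.append(Transaction(100, "deposit"))

# Retrieval follows pointers instead of joining tables on account_id.
total = sum(t.amount for t in acct.transactions)
print(total)  # 60
```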
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used determines how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to
explore the options available in order to prepare for the worst.
SQL Server database recovery is more easily achieved if you are running at least SQL Server 2000,
which has a built-in feature known as the database recovery model that controls the following:
- both the speed and size of your transaction log backups;
- the degree to which you might be at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery model available:
- Full Recovery
- Bulk-Logged Recovery
- Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
- Database restoration up to any specified time can be achieved after a media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
- The Log Marks feature allows you to place reference points in the transaction log, so that you can
recover to a log mark.
- CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index
creation is faster, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the
least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
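The recovery model is a per-database setting. As a brief illustration (T-SQL; the database name `Sales` is an assumption, and the `sys.databases` catalog view applies to SQL Server 2005 and later), it can be inspected and changed like this:

```sql
-- Inspect the current recovery model (database name is illustrative).
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'Sales';

-- Switch between the three models described above.
ALTER DATABASE Sales SET RECOVERY FULL;
-- ALTER DATABASE Sales SET RECOVERY BULK_LOGGED;
-- ALTER DATABASE Sales SET RECOVERY SIMPLE;
```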
(d) Describe deadlocks in a distributed system.
Ans
key factor in determining position in a search until the late 1990s. The increase in search engine optimization (SEO) towards the end of the 1990s led to many websites "keyword stuffing" their metadata to trick search engines, making their websites seem more relevant than others. Since then, search engines have reduced their reliance on metatags, though they are still factored in when indexing pages. Many search engines also try to halt web pages' ability to thwart their systems by regularly changing their criteria for rankings, with Google being notorious for frequently changing its highly undisclosed ranking algorithms.
Metadata can be created manually or by automated information processing. Manual creation tends to be more accurate, allowing the user to input any information they feel is relevant or needed to help describe the file. Automated metadata creation can be much more elementary, usually only displaying information such as file size, file extension, when the file was created, and who created the file.
The differences between a file processing system and a database system are as follows:
1. File-based system: the data and programs are inter-dependent. Database system: the data and programs are independent of each other.
2. File-based system: causes data redundancy; the same data may be duplicated in different files. Database system: controls data redundancy; each item of data appears only once in the system.
3. File-based system: causes data inconsistency, because the same data in different files may differ. Database system: data is always consistent, because it appears only once.
4. File-based system: data cannot easily be shared, because it is distributed across different files. Database system: data is easily shared, because it is stored in one place.
5. File-based system: data is widely spread, so it provides poor security. Database system: provides many methods to maintain data security.
6. File-based system: does not provide consistency constraints. Database system: provides various consistency constraints to maintain data integrity.
7. File-based system: less complex. Database system: very complex.
8. File-based system: costs less than a database system. Database system: costs much more than a file processing system.
9. File-based system: takes more space, and memory is wasted in this approach. Database system: stores data more efficiently; it takes less space and memory is not wasted.
10. File-based system: generating the different reports needed to take a crucial decision is very difficult. Database system: reports can be generated very easily in the required format, because data is stored in an organized manner and is easily retrieved.
11. File-based system: does not provide a concurrency facility. Database system: provides a concurrency facility.
12. File-based system: does not provide data atomicity. Database system: provides data atomicity.
13. File-based system: difficult to maintain, as it provides less control. Database system: provides many facilities to maintain programs.
14. File-based system: if one application fails, it does not affect other files in the system. Database system: if the database fails, it affects all applications that depend on it.
15. File-based system: hardware cost is less. Database system: hardware cost is higher than for a file system.
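Points 11 and 12 above (concurrency and atomicity) are the easiest to see in running code. The following is a minimal sketch, using Python's built-in `sqlite3` module with an illustrative `accounts` table, of the atomicity a database system provides: either every statement in a transaction applies, or none does.

```python
import sqlite3

# Illustrative schema and values: a two-account transfer that fails midway.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
conn.commit()

try:
    with conn:  # the with-block is one transaction: commit on success, rollback on error
        conn.execute("UPDATE accounts SET balance = balance - 80 WHERE id = 1")
        raise RuntimeError("simulated failure mid-transfer")
        conn.execute("UPDATE accounts SET balance = balance + 80 WHERE id = 2")
except RuntimeError:
    pass  # the half-finished transaction was rolled back automatically

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(balances)  # both balances unchanged: {1: 100, 2: 50}
```

A file processing system, by contrast, would leave the first file half-updated after such a failure.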
(b) What is a Database Management System? Explain the components of a Database Management System.
Ans
Organizations employ Database Management Systems (DBMS) to help them effectively manage their data and derive relevant information from it. A DBMS is a technology tool that directly supports data management: a package designed to define, manipulate, and manage data in a database.
Some general functions of a DBMS:
- Allow the definition, creation, querying, update, and administration of databases.
- Define rules to validate the data, relieving users of framing programs for data maintenance.
- Convert an existing database, or archive a large and growing one.
- Run business applications which perform the tasks of managing business processes, interacting with end-users and other applications to capture and analyze data.
Some well-known DBMSs are Microsoft SQL Server, Microsoft Access, Oracle, SAP, and others.
Components of DBMS
A DBMS has several components, each performing significant tasks in the database management system environment. Below is a list of the components within the database and its environment.
Software
This is the set of programs used to control and manage the overall database. It includes the DBMS software itself, the operating system, the network software used to share the data among users, and the application programs used to access data in the DBMS.
Hardware
This consists of the physical electronic devices such as computers, I/O devices, and storage devices. It provides the interface between the computers and real-world systems.
Data
The most important component: the DBMS exists to collect, store, process, and access data. The database contains both the actual (operational) data and the metadata.
Procedures
These are the instructions and rules that assist in using the DBMS and in designing and running the database, using documented procedures to guide the users who operate and manage it.
Database Access Language
This is used to access data to and from the database: to enter new data, update existing data, or retrieve required data. The user writes a set of appropriate commands in a database access language and submits these to the DBMS, which processes the data and generates and displays a set of results in a user-readable form.
Query Processor
This transforms user queries into a series of low-level instructions. It reads the online user's query and translates it into an efficient series of operations in a form capable of being sent to the run-time data manager for execution.
Run-Time Database Manager
Sometimes referred to as the database control system, this is the central software component of the DBMS. It interfaces with user-submitted application programs and queries and handles database access at run time; its function is to carry out the operations in users' queries. It provides control to maintain the consistency, integrity, and security of the data.
Data Manager
Also called the cache manager, this is responsible for handling data in the database, providing recovery to the system so that it can restore the data after a failure.
Database Engine
The core service for storing, processing, and securing data, this provides controlled access and rapid transaction processing to address the requirements of the most demanding data-consuming applications. It is often used to create relational databases for online transaction processing or online analytical processing.
Data Dictionary
This is a reserved space within a database used to store information about the database itself. A data dictionary is a set of read-only tables and views containing information about the data used in the enterprise, ensuring that the database representation of the data follows one standard, as defined in the dictionary.
Report Writer
Also referred to as the report generator, this is a program that extracts information from one or more files and presents the information in a specified format. Most report writers allow the user to select records that meet certain conditions, to display selected fields in rows and columns, or to format the data into different charts.
OR
(c) Explain the three-level architecture proposal for DBMS.
In the previous tutorial we have seen the DBMS architectures: one-tier, two-tier, and three-tier. In this guide we will discuss the three-level DBMS architecture in detail.
DBMS Three Level Architecture Diagram
This architecture has three levels:
1. External level
2. Conceptual level
3. Internal level
1. External level
It is also called the view level. This level is called "view" because several users can view their desired data at this level; the data is internally fetched from the database with the help of the conceptual-level and internal-level mappings.
The user doesn't need to know database schema details such as data structures or table definitions. The user is concerned only with the data, which is returned to the view level after it has been fetched from the database (present at the internal level).
The external level is the top level of the three-level DBMS architecture.
2. Conceptual level
It is also called the logical level. The whole design of the database, such as the relationships among data and the schema of the data, is described at this level. Database constraints and security are also implemented at this level, which is maintained by the DBA (database administrator).
3. Internal level
This level is also known as the physical level. It describes how the data is actually stored on the storage devices and is responsible for allocating space to the data. This is the lowest level of the architecture.
(d) Explain:
(i) Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the ability to modify the schema at one level of the database system without altering the schema at the next higher level.
There are two types of data independence:
1. Logical Data Independence
o Logical data independence refers to the ability to change the conceptual schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual view.
o If we make changes to the conceptual view of the data, the user's view of the data is not affected.
o Logical data independence occurs at the user interface level.
2. Physical Data Independence
o Physical data independence can be defined as the capacity to change the internal schema without having to change the conceptual schema.
o If we make changes to the storage structures of the database server, the conceptual structure of the database is not affected.
o Physical data independence is used to separate the conceptual level from the internal level.
o Physical data independence occurs at the logical interface level.
(ii) Data Integration
Ans:
Data integration involves combining data residing in different sources and providing users with a unified view of them.[1] This process becomes significant in a variety of situations, both commercial (such as when two similar companies need to merge their databases) and scientific (combining research results from different bioinformatics repositories, for example). Data integration appears with increasing frequency as the volume of data (that is, big data[2]) and the need to share existing data explode.[3] It has become the focus of extensive theoretical work, and numerous open problems remain unsolved. Data integration encourages collaboration between internal as well as external users.
Figure 1: Simple schematic for a data warehouse. The extract, transform, load (ETL) process extracts information from the source databases, transforms it, and then loads it into the data warehouse.
Figure 2: Simple schematic for a data-integration solution. A system designer constructs a mediated schema against which users can run queries. The virtual database interfaces with the source databases via wrapper code if required.
Issues with combining heterogeneous data sources, often referred to as information silos, under a single query interface have existed for some time. In the early 1980s, computer scientists began designing systems for interoperability of heterogeneous databases.[4] The first data integration system driven by structured metadata was designed at the University of Minnesota in 1991, for the Integrated Public Use Microdata Series (IPUMS). IPUMS used a data warehousing approach, which extracts, transforms, and loads data from heterogeneous sources into a single view schema, so that data from different sources become compatible.[5] By making thousands of population databases interoperable, IPUMS demonstrated the feasibility of large-scale data integration. The data warehouse approach offers a tightly coupled architecture: because the data are already physically reconciled in a single queryable repository, it usually takes little time to resolve queries.[6]
The data warehouse approach is less feasible for data sets that are frequently updated, requiring the extract, transform, load (ETL) process to be continuously re-executed for synchronization. Difficulties also arise in constructing data warehouses when one has only a query interface to summary data sources and no access to the full data. This problem frequently emerges when integrating several commercial query services, like travel or classified-advertisement web applications.
As of 2009, the trend in data integration favored loosening the coupling between data and providing a unified query interface to access real-time data over a mediated schema (see Figure 2), which allows information to be retrieved directly from the original databases. This is consistent with the SOA approach popular in that era. This approach relies on mappings between the mediated schema and the schemas of the original sources, transforming a query into specialized queries that match the schemas of the original databases. Such mappings can be specified in two ways: as a mapping from entities in the mediated schema to entities in the original sources (the Global As View (GAV) approach), or as a mapping from entities in the original sources to the mediated schema (the Local As View (LAV) approach). The latter approach requires more sophisticated inferences to resolve a query on the mediated schema, but makes it easier to add new data sources to a (stable) mediated schema.
As of 2010, some of the work in data integration research concerns the semantic integration problem. This problem addresses not the structuring of the architecture of the integration, but how to resolve semantic conflicts between heterogeneous data sources. For example, if two companies merge their databases, certain concepts and definitions in their respective schemas, like "earnings", inevitably have different meanings: in one database it may mean profits in dollars (a floating-point number), while in the other it might represent the number of sales (an integer). A common strategy for the resolution of such problems involves the use of ontologies, which explicitly define schema terms and thus help to resolve semantic conflicts; this approach represents ontology-based data integration. On the other hand, the problem of combining research results from different bioinformatics repositories requires benchmarking of the similarities computed from different data sources on a single criterion, such as positive predictive value. This enables the data sources to be directly comparable, and they can be integrated even when the natures of the experiments are distinct.[7]
As of 2011, it was determined that current data modeling methods were imparting data isolation into every data architecture, in the form of islands of disparate data and information silos. This data isolation is an unintended artifact of the data modeling methodology, which results in the development of disparate data models; disparate data models, when instantiated as databases, form disparate databases. Enhanced data model methodologies have been developed to eliminate the data isolation artifact and to promote the development of integrated data models.[8] One enhanced data modeling method recasts data models by augmenting them with structural metadata in the form of standardized data entities. As a result of recasting multiple data models, the set of recast data models will share one or more commonality relationships that relate the structural metadata now common to these data models. Commonality relationships are a peer-to-peer type of entity relationship that relates the standardized data entities of multiple data models; multiple data models that contain the same standard data entity may participate in the same commonality relationship. When integrated data models are instantiated as databases and are properly populated from a common set of master data, then these databases are integrated.
Since 2011, data hub approaches have been of greater interest than fully structured (typically relational) enterprise data warehouses. Since 2013, data lake approaches have risen to the level of data hubs. (See the popularity of all three search terms on Google Trends.[9]) These approaches combine unstructured or varied data into one location, but do not necessarily require an (often complex) master relational schema to structure and define all the data in the hub.
Q2
EITHER
(a) Explain the E-R model with a suitable example.
Ans: The E-R model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an iterative, team-oriented process: all business managers (or designates) should be involved, and the result should be validated with a "bottom-up" approach. It has three primary components: entities, relationships, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships, and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain; when we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order. An entity type is a category; an entity, strictly speaking, is an instance of a given entity type, and there are usually many instances of an entity type. Because the term "entity type" is somewhat cumbersome, most people tend to use "entity" as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student entity's attributes include student ID, student name, address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an "owns" relationship between a company and a computer, a "supervises" relationship between an employee and a department, a "performs" relationship between an artist and a song, a "proved" relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings:
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
(b) Given: entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), and street (street_name, street_number, apartment_number).
An entity relationship diagram displays the relationships of the entity sets stored in a database. In other words, ER diagrams help you to explain the logical structure of databases. At first look, an ER diagram looks very similar to a flowchart; however, the ER diagram includes many specialized symbols, whose meanings make this model unique.
Sample ER Diagram
Facts about the ER Diagram Model:
o The ER model allows you to draw a database design.
o It is an easy-to-use graphical tool for modeling data.
o It is widely used in database design.
o It is a GUI representation of the logical structure of a database.
o It helps you to identify the entities which exist in a system and the relationships between those entities.
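The Customer entity from (b) can also be mapped to a relation, which is how an ER design is usually implemented. A minimal sketch using Python's built-in `sqlite3` (column types and the sample row are illustrative): the composite attributes name and address are flattened into their component columns, and customer_id becomes the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite attributes (name, address, street) become individual columns.
conn.execute("""
    CREATE TABLE Customer (
        customer_id      INTEGER PRIMARY KEY,
        first_name       TEXT NOT NULL,
        middle_name      TEXT,
        last_name        TEXT NOT NULL,
        phone_number     TEXT,
        date_of_birth    TEXT,
        city             TEXT,
        state            TEXT,
        zip_code         TEXT,
        street_name      TEXT,
        street_number    TEXT,
        apartment_number TEXT
    )
""")
conn.execute(
    "INSERT INTO Customer (customer_id, first_name, last_name) VALUES (?, ?, ?)",
    (1, "Asha", "Rao"),  # sample data
)
row = conn.execute("SELECT first_name, last_name FROM Customer").fetchone()
print(row)  # ('Asha', 'Rao')
```

A multivalued attribute (e.g. several phone numbers) would instead become a separate table keyed by customer_id.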
(b) Differentiate between the network and hierarchical data models in DBMS.
Ans: Hierarchical model
1. One-to-many or one-to-one relationships.
2. Based on a parent-child relationship.
3. Retrieval algorithms are complex and asymmetric.
4. More data redundancy.
Network model
1. Many-to-many relationships are allowed.
2. A record can have many parents as well as many children.
3. Retrieval algorithms are complex but symmetric.
4. Less data redundancy than the hierarchical model.
Relational model
1. One-to-one, one-to-many, and many-to-many relationships.
2. Based on relational data structures (tables).
3. Retrieval algorithms are simple and symmetric.
4. Less data redundancy.
OR
(c) Draw an E-R diagram for a Library Management System.
Ans
(d) State advantages and disadvantages of following file organizations
(i) Index-Sequential file
Ans
Sequential File Organization
1. A sequential file is designed for efficient processing of records in sorted order on some search key.
o Records are chained together by pointers to permit fast retrieval in search-key order.
o A pointer points to the next record in order.
o Records are stored physically in search-key order (or as close to this as possible).
o This minimizes the number of block accesses.
o Figure 10.15 shows an example with bname as the search key.
2. It is difficult to maintain physical sequential order as records are inserted and deleted.
o Deletion can be managed with the pointer chains.
o Insertion poses a problem if there is no space where the new record should go.
o If there is space, use it; else put the new record in an overflow block, and adjust the pointers accordingly.
o Figure 10.16 shows the previous example after an insertion.
o Problem: we now have some records out of physical sequential order.
o If there are very few records in overflow blocks, this will work well; if order is lost, reorganize the file.
o Reorganizations are expensive, and are done when the system load is low.
3. If insertions rarely occur, we could keep the file in physically sorted order and reorganize when an insertion occurs. In this case the pointer fields are no longer required.
The Sequential File
A fixed format is used for records:
Records are the same length.
All fields are the same (order and length).
Field names and lengths are attributes of the file.
One field is the key field; it uniquely identifies the record.
Records are stored in key sequence.
New records are placed in a log file or transaction file, and a batch update is performed to merge the log file with the master file.
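The batch update just described can be sketched in a few lines of Python (records and keys are illustrative): both the master file and the log file are kept sorted by key, so the merge is a single sequential pass.

```python
import heapq

# Both runs are sorted by key, as a sequential file requires.
master = [(101, "Ann"), (205, "Ben"), (309, "Eva")]  # existing master file
log = [(150, "Raj"), (400, "Mia")]                   # accumulated new records

# One sequential pass merges the two key-ordered runs into a new master.
new_master = list(heapq.merge(master, log, key=lambda rec: rec[0]))
print(new_master)
# [(101, 'Ann'), (150, 'Raj'), (205, 'Ben'), (309, 'Eva'), (400, 'Mia')]
```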
(ii) Direct file
Direct Access File System (DAFS) is a network file system, similar to Network File System (NFS) and Common Internet File System (CIFS), that allows applications to transfer data while bypassing operating system control, buffering, and network protocol operations that can bottleneck throughput. DAFS uses the Virtual Interface (VI) architecture as its underlying transport mechanism. Using VI hardware, an application transfers data to and from application buffers without using the operating system, which frees up the processor and operating system for other processes and allows files to be accessed by servers using several different operating systems. DAFS is designed and optimized for clustered, shared-file network environments, such as those commonly used for Internet, e-commerce, and database applications. DAFS is optimized for high-bandwidth InfiniBand networks, and it works with any interconnection that supports VI, including Fibre Channel and Ethernet.
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and promote DAFS. Today, more than 85 companies are part of the DAFS Collaborative.
Q3
EITHER
(a) Explain tuple relational calculus
Ans
Relational Calculus
A relational calculus query specifies what is to be retrieved, rather than how to retrieve it; there is no description of how to evaluate a query.
In first-order logic (or predicate calculus), a predicate is a truth-valued function with arguments. When we substitute values for the arguments, the function yields an expression, called a proposition, which can be either true or false.
If a predicate contains a variable (e.g. 'x is a member of staff'), there must be a range for x. When we substitute some values of this range for x, the proposition may be true; for other values it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
Tuple Relational Calculus
We are interested in finding tuples for which a predicate is true, based on the use of tuple variables. A tuple variable is a variable that 'ranges over' a named relation, i.e. a variable whose only permitted values are tuples of the relation.
We specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Tuple Relational Calculus - Example
To find the details of all staff earning more than $10,000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Tuple Relational Calculus
We can use two quantifiers to tell how many instances the predicate applies to:
the existential quantifier ∃ ('there exists');
the universal quantifier ∀ ('for all').
Tuple variables qualified by ∀ or ∃ are called bound variables; the others are called free variables.
Tuple Relational Calculus
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧ (B.branchNo = S.branchNo) ∧ B.city = 'London')
This means 'there exists a Branch tuple with the same branchNo as the branchNo of the current Staff tuple S, and it is located in London'.
Tuple Relational Calculus
The universal quantifier is used in statements about every instance, such as:
(∀B) (B.city ≠ 'Paris')
This means 'for all Branch tuples, the address is not in Paris'. We can also use ~(∃B) (B.city = 'Paris'), which means 'there are no branches with an address in Paris'.
Tuple Relational Calculus
Formulae should be unambiguous and make sense. A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
We can recursively build up formulae from atoms:
An atom is a formula.
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction F1 ∨ F2, and negation ~F1.
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also formulae.
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000:
{S.fName, S.lName | Staff(S) ∧ S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow:
{S | Staff(S) ∧ (∃P) (PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Tuple Relational Calculus
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
To avoid this, we add the restriction that all values in the result must be values in the domain of the expression.
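The calculus query {S | Staff(S) ∧ S.salary > 10000} has a direct SQL counterpart, since SQL is likewise declarative. A minimal sketch using Python's built-in `sqlite3` (the Staff rows are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, fName TEXT, salary INTEGER)")
conn.executemany("INSERT INTO Staff VALUES (?, ?, ?)",
                 [("S1", "Ann", 12000), ("S2", "Ben", 9000), ("S3", "Eva", 30000)])

# The WHERE clause plays the role of the predicate P(S) in the calculus query.
high_paid = [r[0] for r in conn.execute(
    "SELECT staffNo FROM Staff WHERE salary > 10000 ORDER BY staffNo")]
print(high_paid)  # ['S1', 'S3']
```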
Data Manipulations in SQL
SELECT, UPDATE, DELETE, INSERT statements
Basic data retrieval
Condition specification
Arithmetic and aggregate operators
SQL joins: multiple-table queries
Set manipulation: ANY, IN, CONTAINS, ALL, NOT IN, NOT CONTAINS, EXISTS, UNION, MINUS, INTERSECT
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement; data must be entered later using INSERT.
CREATE TABLE S (
    SNO    CHAR(5),
    SNAME  CHAR(20),
    STATUS DECIMAL(3),
    CITY   CHAR(15),
    PRIMARY KEY (SNO)
);
A table name and unique column names must be specified. Columns which are defined as primary keys will never have two rows with the same key value. A primary key may consist of more than one column (values unique in combination); this is called a composite key.
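A composite key can be demonstrated directly. The sketch below uses Python's built-in `sqlite3` with an illustrative Enrollment table: uniqueness holds on the column combination, so repeating one column alone is allowed, but repeating the pair is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Enrollment (
        student_no TEXT,
        course_no  TEXT,
        grade      TEXT,
        PRIMARY KEY (student_no, course_no)
    )
""")
conn.execute("INSERT INTO Enrollment VALUES ('S1', 'C1', 'A')")
conn.execute("INSERT INTO Enrollment VALUES ('S1', 'C2', 'B')")  # same student, new course: OK

try:
    conn.execute("INSERT INTO Enrollment VALUES ('S1', 'C1', 'C')")  # duplicate pair
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```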
(b) Explain data manipulation in SQL.
Ans:
A data manipulation language (DML) is a computer programming language used for adding (inserting), deleting, and modifying (updating) data in a database. A DML is often a sublanguage of a broader database language such as SQL, with the DML comprising some of the operators in the language.[1] Read-only selecting of data is sometimes distinguished as being part of a separate data query language (DQL), but it is closely related and sometimes also considered a component of a DML; some operators may perform both selecting (reading) and writing.
A popular data manipulation language is that of Structured Query Language (SQL), which is used to retrieve and manipulate data in a relational database.[2] Other forms of DML are those used by IMS/DL/I, CODASYL databases such as IDMS, and others.
In SQL, the data manipulation language comprises the SQL-data change statements,[3] which modify stored data but not the schema or database objects. Manipulation of persistent database objects (e.g. tables or stored procedures) via the SQL schema statements,[3] rather than of the data stored within them, is considered to be part of a separate data definition language (DDL). In SQL these two categories are similar in their detailed syntax, data types, expressions, etc., but distinct in their overall function.[3]
The SQL-data change statements are a subset of the SQL-data statements; this set also contains the SELECT query statement,[3] which, strictly speaking, is part of the DQL, not the DML. In common practice, though, this distinction is not made, and SELECT is widely considered to be part of DML,[4] so the DML consists of all SQL-data statements, not only the SQL-data change statements. The SELECT ... INTO form combines both selection and manipulation, and thus is strictly considered to be DML because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a statement, which is almost always a verb. In the case of SQL, these verbs are:
SELECT ... FROM ... WHERE ... (strictly speaking, DQL)
SELECT ... INTO ...
INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...
For example, the command to insert a row into the table employees:
INSERT INTO employees (first_name, last_name, fname)
VALUES ('John', 'Capita', 'xcapit00');
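The four verbs can be exercised in sequence. A minimal sketch using Python's built-in `sqlite3` against an employees table like the one above (the updated surname is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT, then SELECT, UPDATE, and DELETE the same row.
conn.execute("INSERT INTO employees (first_name, last_name, fname) VALUES (?, ?, ?)",
             ("John", "Capita", "xcapit00"))
conn.execute("UPDATE employees SET last_name = ? WHERE fname = ?",
             ("Capita-Smith", "xcapit00"))
updated = conn.execute(
    "SELECT last_name FROM employees WHERE fname = 'xcapit00'").fetchone()[0]
conn.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
remaining = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(updated, remaining)  # Capita-Smith 0
```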
OR
(c) Explain the following integrity rules:
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce these rules automatically, but it is safer to make sure that the rules are already applied in the design. There are two types of integrity mentioned in integrity rules: entity and referential. Two additional rules that aren't necessarily included in integrity rules, but are pertinent to database designs, are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that is unique; this ensures that each row is uniquely identified by the primary key. One requirement for entity integrity is that a primary key cannot have a null value. The purpose of this integrity is to give each row a unique identity, so that foreign key values can properly reference primary key values.
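Both requirements (no null key, no duplicate key) can be seen in a short sketch using Python's built-in `sqlite3` with an illustrative Company table. Note one SQLite quirk, flagged in the comment: a non-INTEGER primary key needs an explicit NOT NULL to forbid NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# In SQLite, a TEXT primary key needs an explicit NOT NULL to reject NULL keys.
conn.execute("CREATE TABLE Company (CompanyId TEXT NOT NULL PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO Company VALUES ('C1', 'Acme')")

errors = 0
for row in [(None, "NoKey"), ("C1", "DuplicateKey")]:  # null key, then duplicate key
    try:
        conn.execute("INSERT INTO Company VALUES (?, ?)", row)
    except sqlite3.IntegrityError:
        errors += 1
print(errors)  # 2: both violations of entity integrity were rejected
```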
Theta Join
In a theta join we apply a condition to the input relation(s), and only the selected
rows take part in the cross product that is merged and included in the output. In a
plain cross product, all the rows of one relation are merged with all the rows of the
second relation; here, only the selected rows of a relation form a cross product with
the second relation. It is denoted R ⋈θ S.
If R and S are two relations, then θ is the condition applied in the select operation
on one relation, and only the selected rows form a cross product with all the rows of
the second relation. For example, given the two relations FACULTY and COURSE, we first
apply a select operation on the FACULTY relation to select certain specific rows; those
rows then form a cross product with the COURSE relation. This is the difference between
a cross product and a theta join.
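The contrast can be sketched with Python's sqlite3; the FACULTY and COURSE attribute names and rows below are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Illustrative FACULTY and COURSE relations (attribute names are assumed).
cur.execute("CREATE TABLE faculty (fac_id INTEGER, fac_name TEXT)")
cur.execute("CREATE TABLE course (course_id TEXT, fac_id INTEGER)")
cur.executemany("INSERT INTO faculty VALUES (?, ?)", [(1, "Khan"), (2, "Rao")])
cur.executemany("INSERT INTO course VALUES (?, ?)", [("DBMS", 1), ("OS", 1), ("CN", 2)])

# Cross product: every faculty row paired with every course row (2 x 3 = 6 rows).
cross = cur.execute("SELECT * FROM faculty, course").fetchall()

# Theta join: only the pairs satisfying the condition theta
# (here, equality on fac_id) appear in the output.
theta = cur.execute(
    "SELECT * FROM faculty JOIN course ON faculty.fac_id = course.fac_id").fetchall()
```

The cross product yields all six pairings, while the theta join keeps only the three that satisfy the condition.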
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having
the foreign key (in the associated table) reference a primary key value (in the primary, or
parent, table). Because of this, we need to ensure that data on both sides of the relationship
remain intact.
So referential integrity requires that whenever a foreign key value is used, it must reference a
valid, existing primary key in the parent table.
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no
foreign key in any related table with the value of 15. We should only be able to delete a primary
key if there are no associated records; otherwise we would end up with an orphaned record.
If a related table contains a foreign key value that doesn't exist in the primary key field of
the primary table (i.e. the "CompanyId" field), the result is an "orphaned record".
So referential integrity will prevent users from:
• Adding records to a related table if there is no associated record in the primary table
• Changing values in a primary table that would result in orphaned records in a related table
• Deleting records from a primary table if there are matching related records
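The first and third of these protections can be demonstrated with Python's sqlite3; the company/product tables and values are illustrative, and note that SQLite enforces foreign keys only when the pragma is enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
cur = conn.cursor()
cur.execute("CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE product (
                 product_id INTEGER PRIMARY KEY,
                 company_id INTEGER REFERENCES company(company_id))""")
cur.execute("INSERT INTO company VALUES (15, 'Acme')")
cur.execute("INSERT INTO product VALUES (1, 15)")

# Adding a child row with no matching parent is rejected (would be orphaned).
try:
    cur.execute("INSERT INTO product VALUES (2, 99)")
    orphan_insert_rejected = False
except sqlite3.IntegrityError:
    orphan_insert_rejected = True

# Deleting a parent that still has matching children is rejected.
try:
    cur.execute("DELETE FROM company WHERE company_id = 15")
    parent_delete_rejected = False
except sqlite3.IntegrityError:
    parent_delete_rejected = True
```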
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually
with no indication of an error. This could result in records being "lost" in the database because
they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated
company)
Or worse yet it could result in customers not receiving products they paid for
Worse still it could affect life and death situations such as a hospital patient not receiving the
correct treatment or a disaster relief team not receiving the correct supplies or information
Data Integrity
Referential integrity is a subset of data integrity which is concerned with the accuracy and
consistency of all data (relationship or otherwise) Maintaining data integrity is a crucial part of
working with databases
(d) Explain the following domain in detail with example.
Ans: Definition: The domain of a database attribute is the set of all allowable values that the
attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column.
In data management and database analysis a data domain refers to all the unique values which
a data element may contain The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values[1]
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type and
allowed to have one of two known code values: 'M' for male, 'F' for female, and NULL for
records where gender is unknown or not applicable (or, arguably, 'U' for unknown as a sentinel
value). The data domain for the gender column is {'M', 'F'}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value, excluding NULL. Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check
constraint or, in more complex cases, in a database trigger. For example, a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero.
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
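Both kinds of domain rule above can be sketched with Python's sqlite3; the person table, its columns and values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# gender is restricted to the domain {'M', 'F'} (NULL passes the CHECK and
# stands for "unknown"); amount must be positive, as in the example above.
cur.execute("""CREATE TABLE person (
                 name   TEXT,
                 gender TEXT CHECK (gender IN ('M', 'F')),
                 amount REAL CHECK (amount > 0))""")
cur.execute("INSERT INTO person VALUES ('Mary', 'F', 10.0)")

# A value outside the declared domain is rejected.
try:
    cur.execute("INSERT INTO person VALUES ('Pat', 'X', 10.0)")
    out_of_domain_rejected = False
except sqlite3.IntegrityError:
    out_of_domain_rejected = True

# NULL (unknown) is still permitted by the CHECK constraint.
cur.execute("INSERT INTO person VALUES ('Sam', NULL, 5.0)")
null_allowed = True
```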
(ii) Degree and cardinality
The degree of relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) with the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
Note that the last is written M:N rather than M:M, because the number of occurrences on each
side may differ.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A
one-to-one relationship rarely exists in practice, but it can; in that case you may consider
combining the two entities into one.
For example, an employee is allocated a company car which can only be driven by that
employee.
Therefore there is a one-to-one relationship between employee and company car.
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For
example, taking the employee and department entities, an employee works in one department but
a department has many employees.
Therefore there is a one-to-many relationship between department and employee.
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity
The normalisation process discussed earlier would prevent any such relationships but the
definition is included here for completeness
As with one-to-one relationships, many-to-many relationships rarely exist. Normally they occur
because an entity has been missed.
For example an employee may work on several projects at the same time and a project has a
team of many employees
Therefore there is a many-to-many relationship between employee and project
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL), the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct and self-contained language. The DBTG is intended to meet the
requirements of many distinct programming languages, not just COBOL: the user in a
DBTG system is considered to be an ordinary application programmer, and the language
therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the
Data Definition Language, or DDL), the DBTG has proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of a conceptual scheme that
was itself defined using the Data Definition Language. It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view.
Architecture of DBTG Model
The architecture of a DBTG system is illustrated in Figure.
The architecture of the DBTG model can be divided into three different levels, like the
architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data items they
contain, and the sets into which they are grouped. (Here logical record types are referred
to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data item and set are excluded.
In the DBTG model the users are application programmers writing in an ordinary
programming language, such as COBOL, that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data item)
defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization: The process of decomposing unsatisfactory "bad" relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties.
Normalization in industry pays particular attention to
normalization up to 3NF, BCNF or 4NF.
We will pay particular attention up to 3NF.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups
To create an unnormalized table
transform data from the information source (e.g. a form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value
If a table of data meets the definition of a relation it is in first normal form
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table
Identify repeating group(s) in unnormalized table which repeats for the key attribute(s)
Remove the repeating group either by:
entering appropriate data into the empty columns of rows containing repeating
data ('flattening' the table)
or by:
placing the repeating data, along with a copy of the original key attribute(s), into a
separate relation
Second Normal Form (2NF)
Based on concept of full functional dependency
If A and B are attributes of a relation,
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A
2NF - A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on the primary key
Second Normal Form (2NF)
1NF and no partial functional dependencies
Partial functional dependency when one or more non-key attributes are functionally
dependent on part of the primary key
Every non-key attribute must be defined by the entire key not just by part of the key
If a relation has a single attribute as its key then it is automatically in 2NF
1NF to 2NF
Identify primary key for the 1NF relation
Identify functional dependencies in the relation
If partial dependencies exist on the primary key remove them by placing them in a new
relation along with copy of their determinant
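The 1NF-to-2NF steps above can be sketched as schemas; the order/order-line tables below are invented for illustration, not taken from the text. The composite key is (order_id, product_id), and order_date depends only on order_id, so it is a partial dependency that must move out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 1NF relation with composite key (order_id, product_id). order_date depends
# only on order_id: a partial dependency, so the relation is not in 2NF.
cur.execute("""CREATE TABLE order_line_1nf (
                 order_id INTEGER, product_id INTEGER,
                 order_date TEXT, quantity INTEGER,
                 PRIMARY KEY (order_id, product_id))""")

# 2NF: the partially dependent attribute is placed in a new relation together
# with a copy of its determinant (order_id), exactly as the steps describe.
cur.execute("""CREATE TABLE orders (
                 order_id INTEGER PRIMARY KEY, order_date TEXT)""")
cur.execute("""CREATE TABLE order_line (
                 order_id INTEGER, product_id INTEGER, quantity INTEGER,
                 PRIMARY KEY (order_id, product_id))""")

tables = [r[0] for r in cur.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
```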
Third Normal Form (3NF)
2NF and no transitive dependencies
Transitive dependency: a functional dependency between two or more non-key attributes
Based on the concept of transitive dependency:
If A, B and C are attributes of a relation such that A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C)
3NF - A relation that is in 1NF and 2NF, and in which no non-primary-key
attribute is transitively dependent on the primary key
OR
(c) Explain multivalued dependency with suitable example.
Ans:
As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
A multivalued dependency X ↠ Y holds when the set of Y-values associated with a given
X-value is independent of the remaining attributes. For example, in a relation
COURSE(course, teacher, book), if each course has an independent set of teachers and an
independent set of recommended books, then course ↠ teacher and course ↠ book.
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is in 4NF if
and only if it is in BCNF and every non-trivial multivalued dependency is in fact a functional
dependency. 4NF removes the unwanted structures caused by multivalued dependencies.
Either:
there is no multivalued dependency in the relation, or
there are multivalued dependencies but the determinant of each is a superkey.
One of these conditions must hold for the relation to be in fourth normal form.
The relation must also be in BCNF; fourth normal form differs from BCNF only in that it
also considers multivalued dependencies.
(d) What are inference axioms Explain its significance in Relational
Database Design
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy
certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity of (2) and (4)
[From Maier]
1. Let R = (ABCDEGHI), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
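Such a derivation can also be checked mechanically with the standard attribute-closure algorithm: AB → GH holds iff {G, H} ⊆ (AB)+. The sketch below is not part of the original solution; it encodes the FD set F from the Maier example.

```python
def closure(attrs, fds):
    """Compute the closure X+ of a set of attributes under a list of FDs.

    fds is a list of (lhs, rhs) pairs, each side a set of attributes.
    Repeatedly apply the inference rules until nothing new is derived.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already in the closure, add the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# F from the Maier example above (attributes written as single letters).
F = [({"A", "B"}, {"E"}), ({"A", "G"}, {"J"}), ({"B", "E"}, {"I"}),
     ({"E"}, {"G"}), ({"G", "I"}, {"H"})]

ab_closure = closure({"A", "B"}, F)
# AB -> GH holds iff {G, H} is contained in (AB)+.
ab_implies_gh = {"G", "H"} <= ab_closure
```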
Significance in Relational Database design: A relational database is a database structure,
commonly used in GIS, in which data is stored in two-dimensional tables and multiple
relationships between data elements can be defined and established in an ad-hoc manner.
A Relational Database Management System is a database system made up of files with data
elements in a two-dimensional array (rows and columns). This database management system
has the capability to recombine data elements to form different relations, resulting in great
flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables.
• Tables are manipulated a set at a time, rather than a record at a time.
• SQL is used to manipulate relational databases. The relational model was proposed by Dr. Codd in 1970.
• It is the basis for the relational database management system (RDBMS).
• The relational model contains the following components:
• A collection of objects or relations
• A set of operations to act on the relations
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be avoided in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order, to prevent such instances. Once a
deadlock does occur, the DBMS must have a method for detecting the deadlock,
and then to resolve it the DBMS must select a transaction to cancel and revert the entire
transaction until the resources required become available, allowing one transaction to
complete while the other has to be reprocessed at a later time.
Explain the meaning of the expression ACID transaction.
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens it
should be atomic: it should either be complete or fully incomplete; there should not
be anything like semi-complete. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, then the transactions
should be scheduled in such a fashion that they remain in isolation from one another.
Durability means that once a transaction commits, its effects will persist even if there are
system failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database is to operate while transactions are in the
process of being changed. Their purpose is to ensure consistency throughout the database. For
example, if I am changing a row which affects the calculations or outputs of several other rows,
then all rows that are affected, or possibly affected, by a change in the row I'm working on will
be locked from changes until I am complete with my change. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let's
say I'm processing a change to the tax rate in my state: my store clerk shouldn't be able
to read the total cost of a blue shirt, because the total-cost row is affected by any changes in
the tax-rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but hasn't been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
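The "resource access order" idea mentioned above for avoiding deadlock can be sketched with Python threads: give every lock a fixed rank and always acquire locks in rank order, so a circular wait (the condition for deadlock) can never form. The lock names, ranks and transactions are invented for illustration.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

# Give every lock a fixed rank; transactions always acquire in rank order.
LOCK_ORDER = {id(lock_a): 0, id(lock_b): 1}

def acquire_in_order(*locks):
    # Sorting by rank guarantees no transaction holds B while waiting for A.
    for lock in sorted(locks, key=lambda l: LOCK_ORDER[id(l)]):
        lock.acquire()

def release_all(*locks):
    for lock in locks:
        lock.release()

results = []

def transaction(name):
    # Both transactions request the same two locks; with the fixed order
    # neither can block the other in a cycle, so both always complete.
    acquire_in_order(lock_a, lock_b)
    try:
        results.append(name)
    finally:
        release_all(lock_a, lock_b)

t1 = threading.Thread(target=transaction, args=("T1",))
t2 = threading.Thread(target=transaction, args=("T2",))
t1.start(); t2.start(); t1.join(); t2.join()
```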
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
• Lock-based protocols
• Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
• Binary Locks: a lock on a data item can be in two states; it is either locked or
unlocked.
• Shared/exclusive: this type of locking mechanism differentiates the locks based on
their uses. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock. Allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state. Read locks are shared, because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first
part, when the transaction starts executing, it seeks permission for the locks it requires. The
second part is where the transaction acquires all the locks. As soon as the transaction releases its
first lock, the third phase starts: in this phase the transaction cannot demand any new locks; it
only releases the acquired locks.
Two-phase locking has two phases: one is growing, where all the locks are being acquired by
the transaction, and the second phase is shrinking, where the locks held by the transaction are
being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first
phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not
release a lock after using it: Strict-2PL holds all the locks until the commit point and releases
them all at once.
Strict-2PL does not have cascading aborts, as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either the system time or a logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 00:02 would be older than all other
transactions that come after it. For example, any transaction entering the system at 00:04 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read- and write-timestamp. This lets the system
know when the last 'read' and 'write' operations were performed on the data item.
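A minimal sketch of the read/write checks that these per-item timestamps make possible, following the common textbook formulation of basic timestamp ordering (the rules are implied rather than spelled out in the text): an operation is refused when it would violate the timestamp order, and the refused transaction would be rolled back and restarted.

```python
class Item:
    def __init__(self):
        self.read_ts = 0   # timestamp of the youngest transaction that read it
        self.write_ts = 0  # timestamp of the youngest transaction that wrote it

def read(item, ts):
    # A transaction older than the last writer would read a value "from the
    # future", so the read is refused; otherwise it is allowed.
    if ts < item.write_ts:
        return False
    item.read_ts = max(item.read_ts, ts)
    return True

def write(item, ts):
    # A write is refused if a younger transaction has already read or
    # written the item; otherwise it is allowed.
    if ts < item.read_ts or ts < item.write_ts:
        return False
    item.write_ts = ts
    return True

x = Item()
ok_young_write = write(x, ts=4)  # a transaction with timestamp 4 writes X
ok_old_read = read(x, ts=2)      # an older transaction must not read the newer value
ok_young_read = read(x, ts=5)    # a younger transaction may read it
```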
OR
(b) Explain database security mechanisms. (8)
Database security covers and enforces security on all aspects and components of databases. This
includes:
• Data stored in the database
• The database server
• The database management system (DBMS)
• Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator
and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
• Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
• Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
• Physical security of the database server and backup equipment from theft and natural
disasters
• Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d)Explain knowledge based database system in detail
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data Data was usually represented in a tabular format with strings or numbers in each
field
Multiple users A conventional database needed to support more than one user or system
logged into the same data at the same time
Transactions An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users These are the so-
called ACID properties Atomicity Consistency Isolation and Durability
Large long-lived data A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data Such a database usually needed to persist past
the specific uses of any individual program it needed to store data for years and decades
rather than for the life of a program
The first knowledge-based systems had data needs that were the opposite of these database
requirements An expert system requires structured data Not just tables with numbers and
strings but pointers to other objects that in turn have additional pointers The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes subclasses and instances
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge. For example, see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge base compared to a conventional
database. The knowledge base needed to know facts about the world, for example to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store information in thousands of rows that
represented specific humans. Representing that all humans are mortal, and
being able to reason about any given human that they are mortal, is the work of a knowledge
base. Representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other
customers are all humans with specific ages, sex, address, etc. is the work of a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, the large database vendors, such as Oracle, added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors, such as Lotus Notes. Knowledge management actually predated the
Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or knowledge-based in the sense of knowledge
management, which provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans: Concurrency control is the procedure in DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data; there is no way they can interfere with one another. However,
any practical database will have a mix of READ and WRITE operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur with a multi-
user system. It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore concurrency control is a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails.
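The bank-transfer example can be made concrete with Python's sqlite3: both updates run inside one transaction, and a simulated failure between them triggers a rollback, so no money is lost or created. The account names and balances are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None  # manage transactions explicitly
cur = conn.cursor()
cur.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
cur.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])

def transfer(amount, fail_midway=False):
    # Both operations run inside one transaction: either both take effect
    # or, on any error, the ROLLBACK undoes the partial withdrawal.
    cur.execute("BEGIN")
    try:
        cur.execute("UPDATE account SET balance = balance - ? WHERE name = 'A'",
                    (amount,))
        if fail_midway:
            raise RuntimeError("simulated crash between the two operations")
        cur.execute("UPDATE account SET balance = balance + ? WHERE name = 'B'",
                    (amount,))
        cur.execute("COMMIT")
    except Exception:
        cur.execute("ROLLBACK")

transfer(30, fail_midway=True)   # fails: balances must be unchanged
after_failure = dict(cur.execute("SELECT name, balance FROM account"))
transfer(30)                     # succeeds: money moves, none is lost
after_success = dict(cur.execute("SELECT name, balance FROM account"))
```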
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
The three levels are explained in detail below:
External Level
This is the highest level, the one that is closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database. This view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part that
is relevant to the user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language such as COBOL or C or a fourth-generation
language specific to the DBMS such as visual FoxPro or MS Access
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
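As a minimal sketch of this division of labour, using Python's sqlite3 (SQLite has no GRANT/REVOKE, so the DCL part is omitted; the table and column names are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and declare a database object
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object
conn.execute("INSERT INTO student (id, name) VALUES (1, 'Asha')")
conn.execute("UPDATE student SET name = 'Asha K' WHERE id = 1")

rows = conn.execute("SELECT id, name FROM student").fetchall()
print(rows)  # [(1, 'Asha K')]
```

In a full SQL DBMS the DCL statements GRANT and REVOKE would complete the trio by controlling which users may run the DML above.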
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus the objectives of the three-level architecture proposal for a DBMS are explained
above.
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2 DML Compiler and Query Optimizer - The DML commands such as insert, update, delete, and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized in the best way to execute the query by
the query optimizer and then sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are -
It converts operations in users' queries, coming from the application programs or from the
combination of DML compiler and query optimizer (together known as the Query Processor), from the user's logical view
to the physical file system.
It controls access to DBMS information that is stored on disk.
It also controls handling buffers in main memory.
It also enforces constraints to maintain consistency and integrity of the data.
It also synchronizes the simultaneous operations performed by concurrent users.
It also controls the backup and recovery operations.
4 Data Dictionary - Data Dictionary is a repository of description of data in the database It
contains information about
1 Data - names of the tables, names of attributes of each table, length of attributes, and number of rows in each table.
2 Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3 Constraints on data, i.e. the range of values permitted.
4 Detailed information on physical database design, such as storage structure,
access paths, and file and record sizes.
5 Access authorization - a description of database users, their responsibilities,
and their access rights.
6 Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to actually control data integrity, database operation,
and accuracy. It may be used as an important part of the DBMS.
Importance of Data Dictionary -
A data dictionary is necessary in databases due to the following reasons:
• It improves the control of the DBA over the information system and the users'
understanding of the use of the system.
• It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts the high-level queries into low-level file access
commands known as compiled DML.
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve User: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. There are other such naive users, where the type and range of responses is always indicated to the user. Thus a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users There are users who may communicate with the database directly via an online terminal or indirectly via a user interface and application program These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program The more sophisticated of these users may also use a data manipulation language to manipulate the database directly On-line users can also be naive users requiring help such as menus
iii) Application Users Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category The application programs could be written in a general purpose programming language such as Assembler C COBOL FORTRAN PASCAL or PLI and include the commands required to manipulate the database
iv) Database Administrator Centralized control of the database is exerted by a person or group of persons under the supervision of a high level administrator This person or group is referred to as the database administrator (DBA) They are users who are the most familiar with the database and are responsible for creating modifying and maintaining its three levels
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data; the database
becomes a common resource for an agency. It implies separation of physical storage from use of the
data by an application program, i.e. program/data independence: the user, programmer, or
application specialist need not know the details of how the data are stored, and such details are
transparent to the user. Changes can be made to data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one device to
another, e.g. from optical to magnetic storage, or from tape to disk.
Advantages:
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system
Every user group maintains its own files for handling its data files This may lead to
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors may be generated due to duplication of the same data in different files
• Time in entering data again and again is wasted
• Computer resources are needlessly used
• It is very difficult to combine information
2 Elimination of Inconsistency - In the file processing system information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data. So we need to remove this duplication of
data in multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In
a conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness are likely to improve, since the data can now be shared, and the DBMS makes it easy to
respond to anticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, use of a DBMS
should allow users that don't know programming to interact with the data more easily, unlike a
file processing system where the programmer may need to write new programs to meet every
new demand.
4 Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed on changing the
data in the database.
5 Integrity can be improved - Since data of an organization using the database approach is
centralized and would be used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to entry of incorrect data in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through DBMS so
standards are easier to enforce Standards may relate to the naming of data format of data
structure of the data etc Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems
7 Security can be improved - In conventional systems, applications are developed in an
ad hoc/temporary manner. Often different systems of an organization would access different
components of the operational data; in such an environment enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since data is
now centralized. It is easier to control who has access to what parts of the database. Different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8 Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers the work of their unit as the most
important and therefore considers their needs as the most important. Once a database has been
set up with centralized control, it will be necessary to identify the organization's requirements and
to balance the needs of the competing units. So it may become necessary to ignore some
requests for information if they conflict with higher-priority needs of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up a database and developing and maintaining
application programs to be far lower than for similar service using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed with DBMSs than using procedural languages.
10 Data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems it is more likely that files will be designed as the needs of particular
applications demand. The overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes such as
recovery and backup from failures, including disk crashes, power failures, and software errors,
which may help the database to recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain the ER model with a suitable example.
Ans: It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process with all business managers (or designates)
involved, and should be validated with a "bottom-up" approach. It has three primary components: entity,
relationship, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified An entity is an abstraction from the complexities of some
domain When we speak of an entity we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world An entity may be a physical object such as a house or a car an event such as a house sale or a car service or a concept such as a customer transaction or order
An entity-type is a category An entity strictly speaking is an instance of a given entity-type There are
usually many instances of an entity-type Because the term entity-type is somewhat cumbersome most
people tend to use the term entity as a synonym for this term
Attributes: an attribute is a characteristic of an entity. A student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: a relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an "owns" relationship between a company and a computer, a "supervises" relationship between an employee and a department, a "performs" relationship
between an artist and a song, a "proved" relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. Types of
relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
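As a hedged illustration of how this entity might map to a relational table (flattening the composite attributes into columns is one common choice, not the only one; the sample row is invented), using Python's sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite attributes (name, address, street) are flattened into columns;
# the primary-key attribute customer_id is the one underlined in the ER diagram.
conn.execute("""
CREATE TABLE customer (
    customer_id      INTEGER PRIMARY KEY,
    first_name       TEXT,
    middle_name      TEXT,
    last_name        TEXT,
    phone_number     TEXT,
    date_of_birth    TEXT,
    city             TEXT,
    state            TEXT,
    zip_code         TEXT,
    street_name      TEXT,
    street_number    TEXT,
    apartment_number TEXT
)
""")
conn.execute(
    "INSERT INTO customer (customer_id, first_name, last_name) VALUES (1, 'Ravi', 'Sharma')")
print(conn.execute("SELECT first_name, last_name FROM customer").fetchall())
```

A multivalued attribute (e.g. several phone numbers per customer) would instead go into a separate table keyed by customer_id.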
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file, and direct file organizations we have considered the retrieval and
update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
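The idea can be sketched in Python (the student records are invented for illustration): a secondary index on "stud_name" maps each value to the set of matching primary keys, unlike a primary index, where each key identifies exactly one record.

```python
# Build a secondary index on the non-unique attribute "stud_name".
records = [
    {"roll_no": 1, "stud_name": "Amit", "branch": "MCA"},
    {"roll_no": 2, "stud_name": "Neha", "branch": "MCA"},
    {"roll_no": 3, "stud_name": "Amit", "branch": "MBA"},
]

secondary_index = {}
for rec in records:
    # each secondary-key value maps to a LIST of primary keys
    secondary_index.setdefault(rec["stud_name"], []).append(rec["roll_no"])

# Secondary-key retrieval may yield several records for one key value.
print(secondary_index["Amit"])  # [1, 3]
```

Retrieval then fetches each record by the primary keys listed in the index entry.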
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3-EITHER
(A) Let R(A,B,C) and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is a join dependency? Discuss 5NF.
Ans Join Dependencies (JD)
A join dependency can be described as follows
1 If a table can be decomposed into three or more smaller tables it must be capable of being joined
again on common keys to form the original table
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot
have a non-trivial lossless decomposition into any number of smaller tables.
Another way of expressing this is that each join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further then it is in 5NF
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one value you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is what do you do if Claiborne starts to sell Jeans How many records must you create to
record this fact
The problem is there are pairwise cyclical dependencies in the primary key That is in order to determine
the item you must know the buyer and vendor and to determine the vendor you must know the buyer and
the item and finally to know the buyer you must know the vendor and the item The solution is to break
this one table into three tables Buyer-Vendor Buyer-Item and Vendor-Item
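A small sketch with Python's sqlite3 (sample rows taken from the table above) shows the join dependency in action: joining the three projection tables on their common keys reconstructs the original Buying facts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for ddl in (
    "CREATE TABLE buyer_vendor (buyer TEXT, vendor TEXT)",
    "CREATE TABLE buyer_item   (buyer TEXT, item   TEXT)",
    "CREATE TABLE vendor_item  (vendor TEXT, item  TEXT)",
):
    conn.execute(ddl)

conn.executemany("INSERT INTO buyer_vendor VALUES (?, ?)",
                 [("Sally", "Liz Claiborne"), ("Sally", "Jordach")])
conn.executemany("INSERT INTO buyer_item VALUES (?, ?)",
                 [("Sally", "Blouses"), ("Sally", "Jeans")])
conn.executemany("INSERT INTO vendor_item VALUES (?, ?)",
                 [("Liz Claiborne", "Blouses"), ("Jordach", "Jeans")])

# Joining the three projections on their common keys reconstructs
# the original Buying relation (the join dependency holds).
rows = conn.execute("""
    SELECT bv.buyer, bv.vendor, bi.item
    FROM buyer_vendor bv
    JOIN buyer_item  bi ON bi.buyer  = bv.buyer
    JOIN vendor_item vi ON vi.vendor = bv.vendor AND vi.item = bi.item
""").fetchall()
print(sorted(rows))
```

If Claiborne starts to sell jeans, a single row inserted into vendor_item records the fact, instead of one row per buyer in the original table.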
(B) Explain the architecture of an IMS System
Ans Information Management system (IMS) is an IBM program product that is designed to support
both batch and online application programs
[Fig.: IMS architecture - application programs A and B, written in a host language plus DL/I, each access the database through PCBs grouped into a PSB (PSB-A, PSB-B); the IMS control program maps these views onto physical databases defined by DBDs.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined by the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping
definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End-users are
supported via user-written on-line application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: the value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: a possible key.
Each non-key field is functionally dependent on every candidate key.
No attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of functional dependencies used in
normalization:
they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of
a dependency, they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation
and has the property that every functional dependency in Y is implied by the
functional dependencies in X.
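A functional dependency X → Y can be checked mechanically: equal values of the determinant must imply equal values of the dependent. A minimal Python sketch (the staff relation below is invented for illustration):

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in `rows`
    (a list of dicts): equal determinant values must imply equal dependents."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # same determinant, different dependent
    return True

staff = [
    {"staff_no": "S1", "branch": "B1", "city": "Pune"},
    {"staff_no": "S2", "branch": "B1", "city": "Pune"},
    {"staff_no": "S3", "branch": "B2", "city": "Nagpur"},
]

print(fd_holds(staff, ["staff_no"], ["branch"]))  # True: staff_no determines branch
print(fd_holds(staff, ["city"], ["staff_no"]))    # False: Pune maps to both S1 and S2
```

Note that a check over sample data can only refute a dependency; whether one truly holds "for all time" is a statement about the enterprise, not about one extension of the relation.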
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties
Normalization in industry pays particular attention to normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
The database designers need not normalize to the highest possible normal form
Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes
Often executed as a series of steps Each step corresponds to a specific normal form which has
known properties
As normalization proceeds relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the
key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all multi-valued dependencies are functional dependencies. 4NF
removes unwanted data structures: multi-valued dependencies.
Either there is no multivalued dependency in the relation, or there are multivalued dependencies but the attributes are dependent between themselves.
One of these conditions must hold true in order for the relation to be in fourth normal form.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers
multivalued dependencies.
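A small Python sketch (the course, teacher, and book names are invented) of why a relation violating 4NF is wasteful: with independent multi-valued facts course →→ teacher and course →→ book stored in one table, every teacher/book combination must appear.

```python
# Independent multi-valued facts about one course.
teachers = {("DBMS", "Rao"), ("DBMS", "Iyer")}
books = {("DBMS", "Date"), ("DBMS", "Navathe")}

# An unnormalized table holding both facts must store the cross product:
unnormalized = {(c, t, b) for (c, t) in teachers for (c2, b) in books if c == c2}
print(len(unnormalized))  # 4 rows for 2 teachers x 2 books

# After 4NF decomposition into (course, teacher) and (course, book),
# adding one teacher adds exactly one row instead of len(books) rows.
teachers.add(("DBMS", "Sen"))
print(len(teachers))  # 3
```

The decomposition also removes the insertion anomaly: in the unnormalized form a new teacher cannot be recorded without pairing them with every book.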
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
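The "pointer following" point above can be illustrated with a small Python sketch (the Account and Transaction classes are invented for illustration): the account object references its transaction objects directly, so retrieving them needs no key lookup or join.

```python
# In an object database, an account object holds direct references to its
# transaction objects; retrieval follows pointers instead of performing a join.
class Transaction:
    def __init__(self, amount):
        self.amount = amount

class Account:
    def __init__(self, owner):
        self.owner = owner
        self.transactions = []  # direct object references

acct = Account("Sally")
acct.transactions.append(Transaction(100))
acct.transactions.append(Transaction(-40))

# Navigation: no foreign-key lookup or join, just pointer following.
print(sum(t.amount for t in acct.transactions))  # 60
```

In a relational design the same query would join an account table to a transaction table on account_id; the object form trades that flexibility for fast navigational access.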
C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery can be more easily achieved if you are running at least SQL Server 2000.
It has a built-in feature known as the database recovery model that controls the following:
Both the speed and size of your transaction log backups.
The degree to which you might be at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery models available:
Full Recovery
Bulk Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery The SQL Server fully logs all operations so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log When data files are lost because of media failure the transaction log can be backed up
Database restoration up to any specified time can be achieved after media failure for a database
file has occurred If your log file is available after the failure you can restore up to the last
transaction committed Log Marks feature allows you to place reference points in the transaction log that allow you to
recover a log mark
Logs CREATE INDEX operations Recovery from a transaction log backup that includes index
creations is done at a faster pace because the index does not have to be rebuilt
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
It allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model
SQL Server truncates the transaction log at regular intervals, removing committed transactions
(d) Describe Deadlocks in a Distributed System.
Ans
... data in a database is stored in an organized manner and can easily be retrieved to generate different reports.
11. A file-based system does not provide a concurrency facility; a database system provides a concurrency facility.
12. A file-based system does not provide data atomicity; a database system provides data atomicity.
13. The cost of a file processing system is less than that of a database system.
14. A file-based system is difficult to maintain, as it provides fewer controlling facilities; a database system provides many facilities for maintaining programs.
15. If one application fails in a file-based system, it does not affect the other files in the system; if the database fails, it affects all applications that depend on the database.
16. Hardware cost is lower for a file-based system than for a database system.
(b) What is a Database Management System? Explain the components of a Database Management System.
Ans:
Organizations employ Database Management Systems (DBMS) to help them effectively manage their data and derive relevant information from it. A DBMS is a technology tool that directly supports data management: a package designed to define, manipulate and manage data in a database.
Some general functions of a DBMS:
Allow the definition, creation, querying, update and administration of databases.
Define rules to validate the data and relieve users of framing programs for data maintenance.
Convert an existing database, or archive a large and growing one.
Run business applications which perform the tasks of managing business processes, interacting with end-users and other applications to capture and analyze data.
Some well-known DBMSs are Microsoft SQL Server, Microsoft Access, Oracle, SAP and others.
Components of DBMS
A DBMS has several components, each performing very significant tasks in the database management system environment. Below is a list of components within the database and its environment.
Software
This is the set of programs used to control and manage the overall database. It includes the DBMS software itself, the operating system, the network software used to share the data among users, and the application programs used to access data in the DBMS.
Hardware
Consists of a set of physical electronic devices such as computers, I/O devices and storage devices; this provides the interface between the computers and the real-world systems.
Data
The DBMS exists to collect, store, process and access data, the most important component. The database contains both the actual (operational) data and the metadata.
Procedures
These are the instructions and rules that assist in how to use the DBMS and in designing and running the database, using documented procedures to guide the users that operate and manage it.
Database Access Language
This is used to access the data to and from the database: to enter new data, update existing data, or retrieve required data. The user writes a set of appropriate commands in a database access language and submits these to the DBMS, which then processes the data and generates and displays a set of results in a user-readable form.
Query Processor
This transforms user queries into a series of low-level instructions. It reads the online user's query and translates it into an efficient series of operations in a form capable of being sent to the run-time data manager for execution.
Run-Time Database Manager
Sometimes referred to as the database control system, this is the central software component of the DBMS that interfaces with user-submitted application programs and queries and handles database access at run time. Its function is to convert the operations in users' queries. It provides control to maintain the consistency, integrity and security of the data.
Data Manager
Also called the cache manager, this is responsible for handling data in the database and provides a recovery mechanism that allows the system to recover the data after a failure.
Database Engine
The core service for storing, processing and securing data, this provides controlled access and rapid transaction processing to address the requirements of the most demanding data-consuming applications. It is often used to create relational databases for online transaction processing or online analytical processing.
Data Dictionary
This is a reserved space within a database used to store information about the database itself. A data dictionary is a set of read-only tables and views containing information about the data used in the enterprise, ensuring that the database representation of the data follows one standard as defined in the dictionary.
Report Writer
Also referred to as the report generator, this is a program that extracts information from one or more files and presents the information in a specified format. Most report writers allow the user to select records that meet certain conditions and to display selected fields in rows and columns, or to format the data into different charts.
OR
(c) Explain the three-level architecture proposal for DBMS. (8)
In the previous tutorial we have seen the DBMS architectures: one-tier, two-tier and three-tier. In this guide we will discuss the three-level DBMS architecture in detail.
DBMS Three Level Architecture Diagram
This architecture has three levels:
1. External level
2. Conceptual level
3. Internal level
1. External level
It is also called the view level. The reason this level is called the "view" level is that several users can view their desired data from it; the data is internally fetched from the database with the help of the conceptual- and internal-level mappings.
The user does not need to know database schema details such as the data structures or table definitions; the user is only concerned with the data, which is returned to the view level after it has been fetched from the database (present at the internal level).
The external level is the "top level" of the three-level DBMS architecture.
2. Conceptual level
It is also called the logical level. The whole design of the database, such as the relationships among the data and the schema of the data, is described at this level.
Database constraints and security are also implemented at this level of the architecture. This level is maintained by the DBA (database administrator).
3. Internal level
This level is also known as the physical level. It describes how the data is actually stored on the storage devices and is also responsible for allocating space to the data. It is the lowest level of the architecture.
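The external level can be illustrated with a SQL view: users query the view without knowing the underlying (conceptual/internal) table layout. A minimal sketch, assuming an illustrative `employee` table not taken from the text:

```python
import sqlite3

# A view stands in for the external (view) level: it exposes only the
# columns a class of users should see, hiding the conceptual schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 52000), (2, 'Ravi', 48000)")

# External level: a view that omits the sensitive salary column.
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")

rows = conn.execute("SELECT * FROM employee_public ORDER BY id").fetchall()
print(rows)  # [(1, 'Asha'), (2, 'Ravi')]
```

If the internal storage of `employee` changes, queries against `employee_public` keep working, which is the data-independence point made in part (d).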
(d) Explain:
(i) Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the characteristic of being able to modify the schema at one level of the database system without altering the schema at the next higher level.
There are two types of data independence:
1. Logical Data Independence
o Logical data independence refers to the characteristic of being able to change the conceptual schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual view.
o If we make any changes in the conceptual view of the data, the user view of the data will not be affected.
o Logical data independence occurs at the user interface level.
2. Physical Data Independence
o Physical data independence can be defined as the capacity to change the internal schema without having to change the conceptual schema.
o If we make any changes to the storage of the database system server, the conceptual structure of the database will not be affected.
o Physical data independence is used to separate the conceptual level from the internal level.
o Physical data independence occurs at the logical interface level.
(ii) Data Integration
Ans:
Data integration involves combining data residing in different sources and providing users with a unified view of it[1]. This process becomes significant in a variety of situations, both commercial (such as when two similar companies need to merge their databases) and scientific (combining research results from different bioinformatics repositories, for example). Data integration appears with increasing frequency as the volume of data (that is, big data[2]) and the need to share existing data explode[3]. It has become the focus of extensive theoretical work, and numerous open problems remain unsolved. Data integration encourages collaboration between internal as well as external users.
Figure 1: Simple schematic for a data warehouse. The extract, transform, load (ETL) process extracts information from the source databases, transforms it and then loads it into the data warehouse.
Figure 2: Simple schematic for a data-integration solution. A system designer constructs a mediated schema against which users can run queries. The virtual database interfaces with the source databases via wrapper code if required.
Issues with combining heterogeneous data sources, often referred to as information silos, under a single query interface have existed for some time. In the early 1980s computer scientists began designing systems for the interoperability of heterogeneous databases[4]. The first data integration system driven by structured metadata was designed at the University of Minnesota in 1991 for the Integrated Public Use Microdata Series (IPUMS). IPUMS used a data warehousing approach, which extracts, transforms and loads data from heterogeneous sources into a single view schema so that data from different sources become compatible[5]. By making thousands of population databases interoperable, IPUMS demonstrated the feasibility of large-scale data integration. The data warehouse approach offers a tightly coupled architecture because the data are already physically reconciled in a single queryable repository, so it usually takes little time to resolve queries[6].
The data warehouse approach is less feasible for data sets that are frequently updated, since it requires the extract, transform, load (ETL) process to be continuously re-executed for synchronization. Difficulties also arise in constructing data warehouses when one has only a query interface to summary data sources and no access to the full data. This problem frequently emerges when integrating several commercial query services like travel or classified advertisement web applications.
As of 2009 the trend in data integration favored loosening the coupling between data[citation needed] and providing a unified query interface to access real-time data over a mediated schema (see Figure 2), which allows information to be retrieved directly from the original databases. This is consistent with the SOA approach popular in that era. This approach relies on mappings between the mediated schema and the schemas of the original sources, and on transforming a query into specialized queries that match the schemas of the original databases. Such mappings can be specified in two ways: as a mapping from entities in the mediated schema to entities in the original sources (the Global-As-View (GAV) approach), or as a mapping from entities in the original sources to the mediated schema (the Local-As-View (LAV) approach). The latter approach requires more sophisticated inference to resolve a query on the mediated schema, but makes it easier to add new data sources to a (stable) mediated schema.
As of 2010 some of the work in data integration research concerns the semantic integration problem. This problem addresses not the structuring of the architecture of the integration, but how to resolve semantic conflicts between heterogeneous data sources. For example, if two companies merge their databases, certain concepts and definitions in their respective schemas, like "earnings", inevitably have different meanings. In one database it may mean profits in dollars (a floating-point number), while in the other it might represent the number of sales (an integer). A common strategy for the resolution of such problems involves the use of ontologies, which explicitly define schema terms and thus help to resolve semantic conflicts. This approach represents ontology-based data integration. On the other hand, the problem of combining research results from different bioinformatics repositories requires benchmarking the similarities computed from different data sources on a single criterion such as positive predictive value. This enables the data sources to be directly comparable, and they can be integrated even when the natures of the experiments are distinct[7].
As of 2011 it was determined that current data modeling methods were imparting data isolation into every data architecture in the form of islands of disparate data and information silos. This data isolation is an unintended artifact of the data modeling methodology, which results in the development of disparate data models; disparate data models, when instantiated as databases, form disparate databases. Enhanced data model methodologies have been developed to eliminate the data isolation artifact and to promote the development of integrated data models[8]. One enhanced data modeling method recasts data models by augmenting them with structural metadata in the form of standardized data entities. As a result of recasting multiple data models, the set of recast data models will share one or more commonality relationships that relate the structural metadata now common to these data models. Commonality relationships are a peer-to-peer type of entity relationship that relate the standardized data entities of multiple data models. Multiple data models that contain the same standard data entity may participate in the same commonality relationship. When integrated data models are instantiated as databases and are properly populated from a common set of master data, then these databases are integrated.
Since 2011, data hub approaches have been of greater interest than fully structured (typically relational) enterprise data warehouses. Since 2013, data lake approaches have risen to the level of data hubs (see the popularity of all three search terms on Google Trends[9]). These approaches combine unstructured or varied data into one location, but do not necessarily require an (often complex) master relational schema to structure and define all data in the hub.
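The ETL step described above can be sketched in a few lines. This is a minimal, illustrative example; the source field names and the mediated schema (`name`, `sales`) are assumptions, not from the text:

```python
# Minimal ETL sketch: extract records from two heterogeneous sources,
# transform each into one mediated schema, and load the result into a
# single list that stands in for the warehouse.
source_a = [{"first": "John", "last": "Doe", "sales": 12}]
source_b = [{"name": "Jane Roe", "num_sales": "7"}]  # sales stored as text here

def transform_a(rec):
    # Combine split name fields into the mediated schema's single field.
    return {"name": f"{rec['first']} {rec['last']}", "sales": int(rec["sales"])}

def transform_b(rec):
    # Reconcile the type conflict: sales arrives as a string in source B.
    return {"name": rec["name"], "sales": int(rec["num_sales"])}

warehouse = [transform_a(r) for r in source_a] + [transform_b(r) for r in source_b]
print(warehouse)
```

The type reconciliation in `transform_b` is exactly the kind of semantic conflict (float dollars vs. integer counts) the "earnings" example above describes.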
Q2
EITHER
(a) Explain the E-R model with a suitable example.
Ans: It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. Modeling is an iterative, team-oriented process: all business managers (or designates) should be involved, and the model should be validated with a "bottom-up" approach. It has three primary components: entities, relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order. An entity type is a category; an entity, strictly speaking, is an instance of a given entity type, and there are usually many instances of an entity type. Because the term entity type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A student's (entity) attributes: student ID, student name, address, etc.
Attributes are of various types:
Simple/single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an "owns" relationship between a company and a computer, a "supervises" relationship between an employee and a department, a "performs" relationship between an artist and a song, a "proved" relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationship are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- N
Symbols and their meanings:
Rectangles represent entity sets.
Diamonds represent relationship sets.
Lines link attributes to entity sets and entity sets to relationship sets.
Ellipses represent attributes.
Double ellipses represent multivalued attributes.
Dashed ellipses denote derived attributes.
Underline indicates primary key attributes.
Example:
(b) Given entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
An entity relationship diagram displays the relationships of the entity sets stored in a database. In other words, ER diagrams help you to explain the logical structure of databases. At first look, an ER diagram looks very similar to a flowchart; however, an ER diagram includes many specialized symbols whose meanings make this model unique.
Sample ER Diagram
Facts about the ER diagram model:
o The ER model allows you to draw a database design.
o It is an easy-to-use graphical tool for modeling data.
o It is widely used in database design.
o It is a GUI representation of the logical structure of a database.
o It helps you to identify the entities which exist in a system and the relationships between those entities.
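One way to make the Customer entity of part (b) concrete is to map it to a relational table, flattening the composite attributes (name, address, street) into simple columns. A minimal sketch; the column names are inferred from the question text:

```python
import sqlite3

# Composite attributes become individual columns; customer_id stays the
# primary key. Only a few columns are populated in the demo insert.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,
        first_name       TEXT,
        middle_name      TEXT,
        last_name        TEXT,
        phone_number     TEXT,
        date_of_birth    TEXT,
        city             TEXT,
        state            TEXT,
        zip_code         TEXT,
        street_name      TEXT,
        street_number    TEXT,
        apartment_number TEXT
    )
""")
conn.execute(
    "INSERT INTO customer (customer_id, first_name, last_name) VALUES (1, 'John', 'Doe')"
)
row = conn.execute("SELECT customer_id, first_name, last_name FROM customer").fetchone()
print(row)  # (1, 'John', 'Doe')
```

A multivalued attribute such as multiple phone numbers would instead become a separate table keyed by customer_id.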
(b) Differentiate between the network and hierarchical data models in DBMS.
Ans: Hierarchical model
1. One-to-many or one-to-one relationships.
2. Based on a parent-child relationship.
3. Retrieval algorithms are complex and asymmetric.
4. More data redundancy.
Network model
1. Many-to-many relationships.
2. A record can have many parents as well as many children.
3. Retrieval algorithms are complex but symmetric.
4. Less data redundancy than the hierarchical model.
Relational model
1. One-to-one, one-to-many and many-to-many relationships.
2. Based on relational data structures.
3. Retrieval algorithms are simple and symmetric.
4. Less data redundancy.
OR
(c) Draw an E-R diagram of a Library Management System.
Ans:
(d) State the advantages and disadvantages of the following file organizations:
(i) Index-sequential file
Ans:
Sequential File Organization
1. A sequential file is designed for efficient processing of records in sorted order on some search key.
o Records are chained together by pointers to permit fast retrieval in search-key order.
o A pointer points to the next record in order.
o Records are stored physically in search-key order (or as close to this as possible).
o This minimizes the number of block accesses.
o Figure 10.15 shows an example with bname as the search key.
2. It is difficult to maintain physical sequential order as records are inserted and deleted.
o Deletion can be managed with the pointer chains.
o Insertion poses problems if there is no space where the new record should go.
o If there is space, use it; else put the new record in an overflow block.
o Adjust pointers accordingly.
o Figure 10.16 shows the previous example after an insertion.
o Problem: we now have some records out of physical sequential order.
o If there are very few records in overflow blocks, this will work well.
o If order is lost, reorganize the file.
o Reorganizations are expensive and done when the system load is low.
3. If insertions rarely occur, we could keep the file in physically sorted order and reorganize when an insertion occurs. In this case the pointer fields are no longer required.
The Sequential File
A fixed format is used for records:
Records are the same length.
All fields are the same (order and length).
Field names and lengths are attributes of the file.
One field is the key field:
It uniquely identifies the record.
Records are stored in key sequence.
New records are placed in a log file or transaction file, and a batch update is performed to merge the log file with the master file.
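The batch-update step above can be sketched as a merge of two key-ordered record lists. This is an illustrative toy (in-memory lists stand in for the master and log files):

```python
# Batch update for a sequential file: the master file and the log
# (transaction) file are both sorted on the key, so one linear pass
# produces the new key-ordered master file.
master = [(10, "A"), (20, "B"), (40, "D")]  # existing records, sorted by key
log    = [(30, "C"), (50, "E")]             # new records, sorted by key

def batch_merge(master, log):
    merged, i, j = [], 0, 0
    while i < len(master) and j < len(log):
        if master[i][0] <= log[j][0]:
            merged.append(master[i]); i += 1
        else:
            merged.append(log[j]); j += 1
    # One of the two lists is exhausted; append the remainder of the other.
    return merged + master[i:] + log[j:]

print(batch_merge(master, log))
# [(10, 'A'), (20, 'B'), (30, 'C'), (40, 'D'), (50, 'E')]
```

The merge is O(n + m), which is why batch updates are preferred to inserting records one at a time into a physically ordered file.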
(ii) Direct file
The Direct Access File System (DAFS) is a network file system, similar to the Network File System (NFS) and the Common Internet File System (CIFS), that allows applications to transfer data while bypassing operating system control, buffering and network protocol operations that can bottleneck throughput. DAFS uses the Virtual Interface (VI) architecture as its underlying transport mechanism. Using VI hardware, an application transfers data to and from application buffers without using the operating system, which frees up the processor and operating system for other processes and allows files to be accessed by servers using several different operating systems. DAFS is designed and optimized for clustered, shared-file network environments that are commonly used for Internet, e-commerce and database applications. It is optimized for high-bandwidth InfiniBand networks, and it works with any interconnect that supports VI, including Fibre Channel and Ethernet.
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and promote DAFS. Today more than 85 companies are part of the DAFS Collaborative.
Q3
EITHER
(a) Explain tuple relational calculus.
Ans:
Relational Calculus
A relational calculus query specifies what is to be retrieved rather than how to retrieve it: there is no description of how to evaluate the query.
In first-order logic (or predicate calculus), a predicate is a truth-valued function with arguments. When we substitute values for the arguments, the function yields an expression, called a proposition, which can be either true or false.
If a predicate contains a variable (e.g. 'x is a member of staff'), there must be a range for x. When we substitute some values of this range for x, the proposition may be true; for other values it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
Tuple Relational Calculus
We are interested in finding tuples for which a predicate is true, based on the use of tuple variables. A tuple variable is a variable that 'ranges over' a named relation, i.e. a variable whose only permitted values are tuples of the relation.
We specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Tuple Relational Calculus - Example
To find details of all staff earning more than $10,000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Tuple Relational Calculus
We can use two quantifiers to tell how many instances the predicate applies to:
The existential quantifier ∃ ('there exists')
The universal quantifier ∀ ('for all')
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called free variables.
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧ (B.branchNo = S.branchNo) ∧ B.city = 'London')
This means 'There exists a Branch tuple with the same branchNo as the branchNo of the current Staff tuple, S, and it is located in London'.
The universal quantifier is used in statements about every instance, such as:
(∀B) (B.city ≠ 'Paris')
This means 'For all Branch tuples, the address is not in Paris'.
We can also use ~(∃B)(B.city = 'Paris'), which means 'There are no branches with an address in Paris'.
Tuple Relational Calculus
Formulae should be unambiguous and make sense. A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
We can recursively build up formulae from atoms:
An atom is a formula.
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction F1 ∨ F2, and negation ~F1.
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also formulae.
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000.
{S.fName, S.lName | Staff(S) ∧ S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow.
{S | Staff(S) ∧ (∃P) (PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Tuple Relational Calculus
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
To avoid this, we add the restriction that all values in the result must be values in the domain of the expression.
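Example (a) above has a direct SQL counterpart. A minimal sketch, with an illustrative Staff table populated with made-up rows:

```python
import sqlite3

# The calculus query {S.fName, S.lName | Staff(S) AND S.position = 'Manager'
# AND S.salary > 25000} expressed as SQL. Data is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Staff (staffNo TEXT, fName TEXT, lName TEXT, position TEXT, salary REAL)"
)
conn.executemany("INSERT INTO Staff VALUES (?,?,?,?,?)", [
    ("S1", "Ann", "Beech", "Manager", 30000),
    ("S2", "Bob", "Lee", "Assistant", 12000),
])
rows = conn.execute(
    "SELECT fName, lName FROM Staff WHERE position = 'Manager' AND salary > 25000"
).fetchall()
print(rows)  # [('Ann', 'Beech')]
```

The WHERE clause plays the role of the predicate P(S); the projected columns correspond to the attributes listed before the bar.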
Data Manipulation in SQL
SELECT, UPDATE, DELETE and INSERT statements
Basic data retrieval
Condition specification
Arithmetic and aggregate operators
SQL joins: multiple-table queries
Set manipulation: ANY, IN, CONTAINS, ALL, NOT IN, NOT CONTAINS, EXISTS, UNION, MINUS, INTERSECT
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement; data must be entered later using INSERT.
CREATE TABLE S (SNO CHAR(5),
                SNAME CHAR(20),
                STATUS DECIMAL(3),
                CITY CHAR(15),
                PRIMARY KEY (SNO));
A table name and unique column names must be specified. Columns which are defined as primary keys will never have two rows with the same key value. A primary key may consist of more than one column (values unique in combination); this is called a composite key.
(b) Explain data manipulation in SQL.
Ans:
A data manipulation language (DML) is a computer programming language used for adding (inserting), deleting and modifying (updating) data in a database. A DML is often a sublanguage of a broader database language such as SQL, with the DML comprising some of the operators in the language[1]. Read-only selecting of data is sometimes distinguished as being part of a separate data query language (DQL), but it is closely related and sometimes also considered a component of a DML; some operators may perform both selecting (reading) and writing.
A popular data manipulation language is that of Structured Query Language (SQL), which is used to retrieve and manipulate data in a relational database[2]. Other forms of DML are those used by IMS/DL/I, CODASYL databases such as IDMS, and others.
In SQL, the data manipulation language comprises the SQL-data change statements[3], which modify stored data but not the schema or database objects. Manipulation of persistent database objects (e.g. tables or stored procedures) via the SQL schema statements[3], rather than of the data stored within them, is considered to be part of a separate data definition language (DDL). In SQL these two categories are similar in their detailed syntax, data types, expressions, etc., but distinct in their overall function[3].
The SQL-data change statements are a subset of the SQL-data statements; the latter also contains the SELECT query statement[3], which strictly speaking is part of the DQL, not the DML. In common practice, though, this distinction is not made, and SELECT is widely considered to be part of the DML[4], so the DML consists of all SQL-data statements, not only the SQL-data change statements. The SELECT ... INTO form combines both selection and manipulation, and thus is strictly considered to be DML because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a statement, which is almost always a verb. In the case of SQL these verbs are:
SELECT ... FROM ... WHERE ... (strictly speaking, DQL)
SELECT ... INTO ...
INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...
For example, the command to insert a row into the table employees:
INSERT INTO employees (first_name, last_name, fname) VALUES ('John', 'Capita', 'xcapit00');
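The four verbs can be exercised end to end against a small table. A self-contained sketch using SQLite (the `employees` columns follow the example above; the updated value is made up):

```python
import sqlite3

# Demonstrate INSERT, UPDATE, SELECT and DELETE in sequence.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT INTO ... VALUES (the example from the text, with string quoting)
conn.execute(
    "INSERT INTO employees (first_name, last_name, fname) VALUES ('John', 'Capita', 'xcapit00')"
)

# UPDATE ... SET ... WHERE
conn.execute("UPDATE employees SET fname = 'jcapita' WHERE last_name = 'Capita'")

# SELECT ... FROM ... WHERE (strictly speaking, DQL)
rows = conn.execute(
    "SELECT first_name, fname FROM employees WHERE last_name = 'Capita'"
).fetchall()
print(rows)  # [('John', 'jcapita')]

# DELETE FROM ... WHERE
conn.execute("DELETE FROM employees WHERE last_name = 'Capita'")
remaining = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(remaining)  # 0
```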
OR
(c) Explain the following integrity rules.
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce these rules automatically, but it is safer to make sure that the rules are applied in the design. There are two types of integrity mentioned in integrity rules: entity and referential. Two additional rules that aren't necessarily included in integrity rules but are pertinent to database designs are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that is unique; this ensures that each row is uniquely identified by the primary key. One requirement for entity integrity is that a primary key cannot have a null value. The purpose of this integrity is for each row to have a unique identity, so that foreign key values can properly reference primary key values.
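Both requirements (no NULL key, no duplicate key) are enforced by the DBMS. A minimal sketch with an illustrative `student` table:

```python
import sqlite3

# Entity integrity: the primary key must be non-NULL and unique, so a
# NULL key and a duplicate key are both rejected by the engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO student VALUES ('s1', 'Mina')")

null_rejected = duplicate_rejected = False
try:
    conn.execute("INSERT INTO student VALUES (NULL, 'NoKey')")     # NULL key
except sqlite3.IntegrityError:
    null_rejected = True
try:
    conn.execute("INSERT INTO student VALUES ('s1', 'Duplicate')")  # duplicate key
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(null_rejected, duplicate_rejected)  # True True
```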
Theta Join
In a theta join we apply a condition on the input relation(s), and then only the selected rows are used in the cross product to be merged and included in the output. In a normal cross product, all the rows of one relation are mapped/merged with all the rows of the second relation, but here only selected rows of one relation are cross-producted with the second relation. It is denoted as ⋈θ.
If R and S are two relations, then θ is the condition applied in a select operation on one relation; only the selected rows are then cross-producted with all the rows of the second relation. For example, given the two relations FACULTY and COURSE, we first apply a select operation on the FACULTY relation to select certain specific rows; these rows then form a cross product with the COURSE relation. This is the difference between the cross product and the theta join. Showing both relations, their attributes, and the cross product after the select operation makes the difference between cross product and theta join clear.
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having the foreign key (in the associated table) reference a primary key value (in the primary, or parent, table). Because of this, we need to ensure that data on both sides of the relationship remain intact.
So, referential integrity requires that whenever a foreign key value is used, it must reference a valid, existing primary key in the parent table.
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no foreign key in any related table with the value of 15. We should only be able to delete a primary key if there are no associated records. Otherwise we would end up with an orphaned record: the related table contains a foreign key value that doesn't exist in the primary key field of the primary table (i.e. the "CompanyId" field), resulting in an "orphaned record".
So referential integrity will prevent users from:
Adding records to a related table if there is no associated record in the primary table.
Changing values in a primary table that result in orphaned records in a related table.
Deleting records from a primary table if there are matching related records.
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually with no indication of an error. This could result in records being "lost" in the database because they're never returned in queries or reports. It could also result in strange results appearing in reports (such as products without an associated company). Or, worse yet, it could result in customers not receiving products they paid for. Worse still, it could affect life-and-death situations, such as a hospital patient not receiving the correct treatment, or a disaster relief team not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity, which is concerned with the accuracy and consistency of all data (relationship or otherwise). Maintaining data integrity is a crucial part of working with databases.
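The three prevented actions can be demonstrated with a foreign key constraint. A minimal sketch using the company/product example above (table and column names are illustrative):

```python
import sqlite3

# Referential integrity: a child row cannot reference a missing parent,
# and a referenced parent cannot be deleted.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # FK enforcement is off by default in SQLite
conn.execute("CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    company_id INTEGER REFERENCES company(company_id))""")
conn.execute("INSERT INTO company VALUES (15, 'Acme')")
conn.execute("INSERT INTO product VALUES (1, 15)")  # valid reference

orphan_rejected = delete_rejected = False
try:
    conn.execute("INSERT INTO product VALUES (2, 99)")  # no company 99 exists
except sqlite3.IntegrityError:
    orphan_rejected = True
try:
    conn.execute("DELETE FROM company WHERE company_id = 15")  # still referenced
except sqlite3.IntegrityError:
    delete_rejected = True
print(orphan_rejected, delete_rejected)  # True True
```

This is exactly the "record number 15" scenario from the text: the delete is blocked while a product still references company 15.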
(d)Explain following domain in details with example
AnsDefinition The domain of a database attribute is the set of all allowable values that
attribute may assume
Examples
A field for gender may have the domain male female unknown where those three values are
the only permitted entries in that column
In data management and database analysis a data domain refers to all the unique values which
a data element may contain The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values[1]
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type and
allowed to have one of two known code values: M for male, F for female, and NULL for
records where gender is unknown or not applicable (or, arguably, U for unknown as a sentinel
value). The data domain for the gender column is {M, F}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value, excluding NULL. Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules if database-enforced may be implemented through a check
constraint or in more complex cases in a database trigger For example a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
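The check-constraint approach described above can be sketched for the gender example using SQLite. The table and column names here are illustrative, not from the original answer; note that a CHECK constraint only rejects values that make the condition false, so NULL (unknown) is still admitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint enforces the domain {'M', 'F'} at the database level.
conn.execute("""CREATE TABLE Person (
    Name   TEXT,
    Gender TEXT CHECK (Gender IN ('M', 'F')))""")

conn.execute("INSERT INTO Person VALUES ('Mary', 'F')")
conn.execute("INSERT INTO Person VALUES ('Pat', NULL)")  # unknown is allowed

# A value outside the domain is rejected by the DBMS.
try:
    conn.execute("INSERT INTO Person VALUES ('Sam', 'X')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Only the two conforming rows are stored; the out-of-domain value never enters the table.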
(ii) Degree and cardinality
The degree of relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) to the number of occurrences in another
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
Note that the many-to-many form is correctly written M:N, not M:M.
One-to-one (11)
This is where one occurrence of an entity relates to only one occurrence in another entityA one-
to-one relationship rarely exists in practice but it can However you may consider combining
them into one entity
For example an employee is allocated a company car which can only be driven by that
employee
Therefore there is a one-to-one relationship between employee and company car
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example,
taking the employee and department entities shown on the previous page, an employee works in
one department but a department has many employees.
Therefore there is a one-to-many relationship between department and employee
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity
The normalisation process discussed earlier would prevent any such relationships but the
definition is included here for completeness
As with one-to-one relationships many-to-many relationships rarely exist Normally they occur
because an entity has been missed
For example an employee may work on several projects at the same time and a project has a
team of many employees
Therefore there is a many-to-many relationship between employee and project
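In a relational schema, an M:N relationship such as employee/project is usually implemented with a junction table, which resolves the one M:N relationship into two 1:M relationships. A minimal sketch in SQLite (the table names, columns, and sample data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EmpId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Project  (ProjId INTEGER PRIMARY KEY, Title TEXT);
-- Junction table: each row links one employee to one project,
-- resolving the M:N relationship into two 1:M relationships.
CREATE TABLE WorksOn (
    EmpId  INTEGER REFERENCES Employee(EmpId),
    ProjId INTEGER REFERENCES Project(ProjId),
    PRIMARY KEY (EmpId, ProjId));
INSERT INTO Employee VALUES (1, 'Asha'), (2, 'Ben');
INSERT INTO Project  VALUES (10, 'Payroll'), (20, 'Inventory');
INSERT INTO WorksOn  VALUES (1, 10), (1, 20), (2, 10);
""")

# Employee 1 works on several projects, and project 10 has a team of several
# employees, exactly the many-to-many situation described in the text.
asha_projects = conn.execute(
    "SELECT COUNT(*) FROM WorksOn WHERE EmpId = 1").fetchone()[0]
team_10 = conn.execute(
    "SELECT COUNT(*) FROM WorksOn WHERE ProjId = 10").fetchone()[0]
```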
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL), the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct and self-contained language. The DBTG is intended to meet the
requirements of many distinct programming languages, not just COBOL; the user in a
DBTG system is considered to be an ordinary application programmer, and the language
therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the
Data Definition Language or DDL), the DBTG proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of the conceptual scheme that
was itself defined using the Data Definition Language. It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view.
Architecture of DBTG Model
The architecture of a DBTG system is illustrated in Figure.
The architecture of the DBTG model can be divided into three different levels, as in the
architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data-items they
contain, and the sets into which they are grouped. (Here logical record types are referred
to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data-items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data-item, and set are excluded.
In the DBTG model, the users are application programmers writing in an ordinary
programming language such as COBOL that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data-item)
defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties
Normalization in industry pays particular attention to
normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups
To create an unnormalized table
transform data from information source (eg form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
If a table of data meets the definition of a relation, it is in first normal form.
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table
Identify repeating group(s) in unnormalized table which repeats for the key attribute(s)
Remove repeating group by
entering appropriate data into the empty columns of rows containing repeating
data ('flattening' the table)
Or by
placing repeating data along with copy of the original key attribute(s) into a
separate relation
Second Normal Form (2NF)
Based on concept of full functional dependency
A and B are attributes of a relation
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A
2NF - A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on the primary key
Second Normal Form (2NF)
1NF and no partial functional dependencies
Partial functional dependency when one or more non-key attributes are functionally
dependent on part of the primary key
Every non-key attribute must be defined by the entire key not just by part of the key
If a relation has a single attribute as its key then it is automatically in 2NF
1NF to 2NF
Identify primary key for the 1NF relation
Identify functional dependencies in the relation
If partial dependencies exist on the primary key remove them by placing them in a new
relation along with copy of their determinant
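The 1NF-to-2NF steps above can be illustrated with a hypothetical order-line relation whose key is (OrderId, ProductId): ProductName depends only on ProductId, a partial dependency, so it is placed in a new relation along with a copy of its determinant. The table and column names below are invented for illustration; a sketch in SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Before: OrderLine(OrderId, ProductId, Qty, ProductName) is 1NF but not 2NF,
-- because ProductName depends only on part of the key (ProductId).
-- After the 2NF decomposition, the partial dependency lives in its own
-- relation together with a copy of its determinant:
CREATE TABLE Product (
    ProductId   INTEGER PRIMARY KEY,
    ProductName TEXT);
CREATE TABLE OrderLine (
    OrderId   INTEGER,
    ProductId INTEGER REFERENCES Product(ProductId),
    Qty       INTEGER,
    PRIMARY KEY (OrderId, ProductId));
INSERT INTO Product VALUES (7, 'Widget');
INSERT INTO OrderLine VALUES (100, 7, 3), (101, 7, 5);
""")

# The product name is stored once, however many orders mention it,
# and a join reconstructs the original information without redundancy.
rows = conn.execute("""
    SELECT o.OrderId, p.ProductName, o.Qty
    FROM OrderLine o JOIN Product p ON o.ProductId = p.ProductId
    ORDER BY o.OrderId""").fetchall()
```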
Third Normal Form (3NF)
2NF and no transitive dependencies
Transitive dependency a functional dependency between two or more non-key attributes
Based on concept of transitive dependency
A, B and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C)
3NF - A relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key
OR
(c) Explain multivalued dependency with suitable example.
Ans:
As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is said to be in
4NF if and only if it is in BCNF and its multi-valued dependencies are functional
dependencies. 4NF removes unwanted data structures: multi-valued dependencies.
One of these conditions must hold for a relation to be in fourth normal form:
• There is no multivalued dependency in the relation, or
• There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
considers multivalued dependencies.
(d) What are inference axioms Explain its significance in Relational
Database Design
Ans Inference Axioms (A-axioms or Armstrongrsquos Axioms)
An inference axiom is a rule that states if a relation satisfies certain FDs then it must satisfy
certain other FDs
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}
We want to show Street Zip → City Street Zip
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity of (2) and (4)
[From Maier]
1. Let R = (ABCDEGHI), F = {AB → E, AG → J, BE → I, E → G, GI → H}
Show that AB → GH is derived by F:
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
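A derivation like the one above can also be checked mechanically with the standard attribute-closure algorithm: AB → GH holds if and only if G and H appear in the closure of AB under F. This is a short sketch in Python, using the FDs from the Maier example:

```python
def closure(attrs, fds):
    """Compute the closure of a set of attributes under a list of FDs.

    Each FD is a (determinant, dependent) pair of attribute strings.
    Repeatedly apply any FD whose left side is already in the result
    until no new attributes can be added.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# F from the Maier example above, written as (determinant, dependent) pairs.
F = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]

ab_closure = closure("AB", F)
# AB -> GH holds iff both G and H are in the closure of AB.
```

Running this yields the closure {A, B, E, G, I, J, H}, which contains both G and H, confirming step 12 of the derivation.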
Significance in Relational Database design: A relational database is a database structure, commonly used in GIS, in
which data is stored based on two-dimensional tables, where multiple relationships between data
elements can be defined and established in an ad-hoc manner. A Relational Database Management
System is a database system made up of files with data elements in a two-dimensional array (rows
and columns). This database management system has the capability to recombine data elements
to form different relations, resulting in a great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables:
• Tables are manipulated a set at a time rather than a record at a time
• SQL is used to manipulate relational databases; the model was proposed by Dr. Codd in 1970
• It is the basis for the relational database management system (RDBMS)
• The relational model contains the following components:
• Collection of objects or relations
• Set of operations to act on the relations
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions each require access to data that
is locked by the other. It can be addressed in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways to break a deadlock
after it happens. One way to prevent deadlocks is to require a transaction to request
all necessary locks at one time, ensuring it gains access to everything it needs or
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a fixed order to prevent such instances. Essentially,
once a deadlock does occur, the DBMS must have a method for detecting the deadlock;
to resolve it, the DBMS must select a transaction to cancel and revert that entire
transaction until the resources required become available, allowing one transaction to
complete while the other is reprocessed at a later time.
The meaning of the expression ACID transaction:
ACID stands for Atomicity, Consistency, Isolation, Durability. When any transaction happens, it
should be atomic, that is, it should either be complete or fully incomplete; there should not
be anything like semi-complete. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, the transactions
should be scheduled in such a fashion that they remain in isolation from one another. Durability
means that once a transaction commits, its effects will persist even if there are system
failures.
The purpose of transaction isolation levels:
Transaction isolation levels affect how the database operates while transactions are in the process of being
changed. Their purpose is to ensure consistency throughout the database. For example, if I
am changing a row which affects the calculations or outputs of several other rows, then
all rows that are affected, or possibly affected, by a change in the row I am working on will
be locked from changes until my change is complete. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let's
say I am processing a change to the tax rate in my state; my store clerk shouldn't be able
to read the total cost of a blue shirt, because the total cost row is affected by any change in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but hasn't been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
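The resource-access-order rule mentioned in the deadlock answer above can be sketched with two threads that always acquire their locks in one fixed global order (ordering by object id here is an illustrative choice), so a cycle of mutual waits can never form even though the two "transactions" name the locks in opposite order:

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def transfer(src_lock, dst_lock, action):
    # Deadlock avoidance: always acquire locks in a fixed global order
    # (here, by id), so no two transactions can wait on each other cyclically.
    first, second = sorted((src_lock, dst_lock), key=id)
    with first:
        with second:
            action()

results = []
# The two threads request the same pair of locks in opposite order,
# which without the ordering rule could deadlock.
t1 = threading.Thread(target=transfer,
                      args=(lock_a, lock_b, lambda: results.append("t1")))
t2 = threading.Thread(target=transfer,
                      args=(lock_b, lock_a, lambda: results.append("t2")))
t1.start(); t2.start()
t1.join(); t2.join()
```

Both threads run to completion because the lock order is globally consistent.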
(b) Explain concurrency control and database recovery in detail
Ans In a multiprogramming environment where multiple transactions can be executed
simultaneously it is highly important to control the concurrency of transactions We have
concurrency control protocols to ensure atomicity isolation and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
• Lock-based protocols
• Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
• Binary locks: a lock on a data item can be in two states; it is either locked or
unlocked.
• Shared/exclusive: this type of locking mechanism differentiates the locks based on
their uses. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock. Allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state. Read locks are shared because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
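The two-phase rule itself (no lock may be acquired after the first release) can be sketched as a tiny lock-tracking class. This is a hypothetical illustration of the rule, not a real DBMS API:

```python
class TwoPhaseTransaction:
    """Sketch of the two-phase rule: once any lock is released
    (the shrinking phase begins), no further lock may be acquired."""

    def __init__(self):
        self.held = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violated: cannot lock after first unlock")
        self.held.add(item)          # growing phase: acquire freely

    def unlock(self, item):
        self.shrinking = True        # first release starts the shrinking phase
        self.held.discard(item)

t = TwoPhaseTransaction()
t.lock("x")
t.lock("y")        # growing phase: both locks acquired
t.unlock("x")      # shrinking phase begins
try:
    t.lock("z")    # violates the two-phase rule
    violated = False
except RuntimeError:
    violated = True
```

Any schedule in which every transaction obeys this rule is conflict-serializable, which is why the protocol matters.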
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as 2PL. After acquiring all the locks in the first phase, the
transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a
lock after using it. Strict-2PL holds all the locks until the commit point and releases all the locks
at one time.
Strict-2PL does not have cascading aborts as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based protocol This protocol
uses either system time or logical counter as a timestamp
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution whereas timestamp-based protocols start working as soon as a transaction is
created
Every transaction has a timestamp associated with it and the ordering is determined by the age
of the transaction A transaction created at 0002 clock time would be older than all other
transactions that come after it For example any transaction y entering the system at 0004 is
two seconds younger and the priority would be given to the older one
In addition every data item is given the latest read and write-timestamp This lets the system
know when the last lsquoread and writersquo operation was performed on the data item
OR
(b) Explain database security mechanisms8
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d)Explain knowledge based database system in detail
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data Data was usually represented in a tabular format with strings or numbers in each
field
Multiple users A conventional database needed to support more than one user or system
logged into the same data at the same time
Transactions An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users These are the so-
called ACID properties Atomicity Consistency Isolation and Durability
Large long-lived data A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data Such a database usually needed to persist past
the specific uses of any individual program it needed to store data for years and decades
rather than for the life of a program
The first knowledge-based systems had data needs that were the opposite of these database
requirements An expert system requires structured data Not just tables with numbers and
strings but pointers to other objects that in turn have additional pointers The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes subclasses and instances
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database. The knowledge-base needed to know facts about the world, for example, to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store information about thousands of table rows that
represented information about specific humans. Representing that all humans are mortal, and
being able to reason about any given human that they are mortal is the work of a knowledge-
base Representing that George Mary Sam Jenna Mike and hundreds of thousands of other
customers are all humans with specific ages sex address etc is the work for a database[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined As the technology scaled up it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning and knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by us humans
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans: Concurrency control is the procedure in DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data; there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts which mostly occur with a multi-
user system It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of respective databases
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client. At one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails.
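The account-transfer example can be sketched with SQLite, whose connection context manager wraps the two updates in one atomic transaction that either commits both or rolls both back. The account names and the simulated mid-transfer failure are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (Id TEXT PRIMARY KEY, Balance INTEGER)")
conn.executemany("INSERT INTO Account VALUES (?, ?)",
                 [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Withdraw from src and deposit to dst atomically."""
    try:
        with conn:  # sqlite3 context manager = one transaction
            conn.execute("UPDATE Account SET Balance = Balance - ? WHERE Id = ?",
                         (amount, src))
            conn.execute("UPDATE Account SET Balance = Balance + ? WHERE Id = ?",
                         (amount, dst))
            # Simulate a failure after both updates but before commit:
            if amount > 100:
                raise RuntimeError("simulated crash mid-transfer")
    except RuntimeError:
        pass  # the 'with' block has already rolled both updates back

transfer(conn, "A", "B", 30)    # succeeds: both updates commit
transfer(conn, "A", "B", 999)   # fails: neither update survives
balances = dict(conn.execute("SELECT Id, Balance FROM Account"))
```

After the failed transfer the balances are exactly as the first transfer left them, and no money is lost or created.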
(B) Give the three-level architecture proposal for DBMS
Ans Objective of three level architecture proposal for DBMS
All users should be able to access same data
A users view is immune to changes made in other views
Users should not need to know physical database storage details
DBA should be able to change database storage structures without affecting the users views
Internal structure of database should be unaffected by changes to physical aspects of storage
DBA should be able to change conceptual structure of database without affecting all users
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
Above three points are explain in detail given bellow-
External Level
This is the highest level one that is closest to the user It is also called the user view The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language such as COBOL or C or a fourth-generation
language specific to the DBMS such as visual FoxPro or MS Access
The end user uses a query language to access data from the database A query language is a
combination of three subordinate language
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
So that objective of three level of architecture proposal for DBMS are suitable explain in
above
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2 DML Compiler and Query optimizer - The DML commands such as insert, update, delete,
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized in the best way to execute a query by
the query optimizer and then sent to the data manager.
3 Data Manager - The data manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the data manager are -
It converts operations in users' queries, coming from the application programs or from the
DML compiler and query optimizer (together known as the query processor), from the user's
logical view to the physical file system.
It controls access to the DBMS information that is stored on disk.
It also controls the handling of buffers in main memory.
It also enforces constraints to maintain the consistency and integrity of the data.
It also synchronizes the simultaneous operations performed by concurrent users.
It also controls the backup and recovery operations.
4 Data Dictionary - The data dictionary is a repository of descriptions of the data in the database. It
contains information about:
1 Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2 Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3 Constraints on data, i.e. the range of values permitted.
4 Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
5 Access authorization - a description of database users, their responsibilities
and their access rights.
6 Usage statistics, such as frequencies of queries and transactions.
The data dictionary is used to actually control data integrity, database operation
and accuracy, and it may be used as an important part of the DBMS.
Importance of Data Dictionary -
The data dictionary is necessary in databases for the following reasons:
It improves the control of the DBA over the information system and the users'
understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the results of every design phase and the design decisions.
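As a small illustration of the idea (my own sketch, not part of the original answer), SQLite exposes its data dictionary as the read-only catalog table sqlite_master, which can be queried like any other table; the table name below is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")

# Query the catalog the way a DBA would consult a data dictionary
names = [row[0] for row in
         conn.execute("SELECT name FROM sqlite_master WHERE type='table'")]
print(names)  # ['course']
```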
5 Data Files - These contain the data portion of the database.
6 Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve Users - Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database - in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of responses is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online Users - These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application Programmers - Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator - Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of the physical storage from the use of the
data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored, as such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one
device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files.
• Wastage of storage space, since duplicated data is stored.
• Errors generated due to updating the same data in different files.
• Time wasted in entering the same data again and again.
• Computer resources being needlessly used.
• Great difficulty in combining information.
2 Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data. We therefore need to remove this duplication of
data in multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In a
conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, the use of a DBMS
should allow users who don't know programming to interact with the data more easily, unlike a
file processing system where the programmer may need to write new programs to meet every
new demand.
4 Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the data in the
database changes.
5 Integrity can be improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through DBMS so
standards are easier to enforce Standards may relate to the naming of data format of data
structure of the data etc Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems
7 Security can be improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization access different
components of the operational data; in such an environment enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database, and different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8 Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work as the most
important, and therefore its needs as the most important. Once a database has been
set up with centralized control, it becomes necessary to identify the organization's requirements and
to balance the needs of the competing units. It may thus become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed with DBMSs than using procedural languages.
10 A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems it is more likely that files will be designed as per the needs of particular
applications, and the overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes for
recovery and backup from failures, including disk crashes, power failures and software errors,
which may help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods involved are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans The E-R model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process, with all business managers (or their designates)
involved, and should be validated with a "bottom-up" approach. It has three primary components: entity,
relationship and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified An entity is an abstraction from the complexities of some
domain When we speak of an entity we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category An entity strictly speaking is an instance of a given entity-type There are
usually many instances of an entity-type Because the term entity-type is somewhat cumbersome most
people tend to use the term entity as a synonym for this term
Attributes - An attribute is a characteristic of an entity. A student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship - A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One to many (1 : M)
Many to one (M : 1)
Many to many (M : N)
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
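One common way to map such an entity to a relation is to flatten the composite attributes into simple columns. The sketch below (my own illustration using Python's sqlite3; the exact column layout is an assumption, not from the paper) shows the Customer entity as a table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,                    -- key attribute (underlined)
        first_name TEXT, middle_name TEXT, last_name TEXT,  -- composite attribute: name
        phone_number TEXT,
        date_of_birth TEXT,
        city TEXT, state TEXT, zip_code TEXT,               -- composite attribute: address
        street_name TEXT, street_number TEXT, apartment_number TEXT  -- composite: street
    )
""")
# Inspect the flattened columns via the catalog
cols = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
print(len(cols))  # 12
```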
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans In sequential files, index sequential files and direct files we have considered the retrieval and
update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
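The "stud_name" example above can be sketched concretely (my own illustration, using Python's sqlite3; names and roll numbers are invented). A secondary index on the non-unique attribute lets one key value retrieve several records:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, stud_name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, 'Amit'), (2, 'Neha'), (3, 'Amit')])

# Secondary index on the non-unique attribute stud_name
conn.execute("CREATE INDEX idx_stud_name ON student(stud_name)")

# Secondary key retrieval: several records satisfy one key value
matches = conn.execute(
    "SELECT roll_no FROM student WHERE stud_name = 'Amit' ORDER BY roll_no"
).fetchall()
print(matches)  # [(1,), (3,)]
```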
(D) Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE3- EITHER
(A) Let R(A,B,C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries -
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows
1 If a table can be decomposed into three or more smaller tables it must be capable of being joined
again on common keys to form the original table
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot
have a lossless decomposition into any number of smaller tables.
Another way of expressing this is: every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one value you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
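The decomposition just described can be tried out concretely. The sketch below (my own illustration using Python's sqlite3) projects the sample Buying table onto the three pairwise tables and joins them back on the common keys; for this sample data the three-way join reproduces exactly the original five rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE buying (buyer TEXT, vendor TEXT, item TEXT)")
rows = [('Sally', 'Liz Claiborne', 'Blouses'),
        ('Mary',  'Liz Claiborne', 'Blouses'),
        ('Sally', 'Jordach', 'Jeans'),
        ('Mary',  'Jordach', 'Jeans'),
        ('Sally', 'Jordach', 'Sneakers')]
cur.executemany("INSERT INTO buying VALUES (?, ?, ?)", rows)

# Project onto the three pairwise tables
cur.execute("CREATE TABLE bv AS SELECT DISTINCT buyer, vendor FROM buying")
cur.execute("CREATE TABLE bi AS SELECT DISTINCT buyer, item  FROM buying")
cur.execute("CREATE TABLE vi AS SELECT DISTINCT vendor, item FROM buying")

# Join them back on the common keys
joined = cur.execute("""
    SELECT DISTINCT bv.buyer, bv.vendor, vi.item
    FROM bv JOIN vi ON bv.vendor = vi.vendor
            JOIN bi ON bi.buyer = bv.buyer AND bi.item = vi.item
""").fetchall()
print(len(joined))  # 5 - the original table is recovered losslessly
```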
(B) Explain the architecture of an IMS System
Ans Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
[Figure: IMS system structure - each application (A, B) is written in a host language plus DL/I, has a PSB consisting of PCBs, and the IMS control program maps these through the DBDs to the stored databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined, together with its mapping to storage, by a database description (DBD). The set of all
DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping
definition.
DBD (Database Description) - Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1 DBD NAME=EDUCPDBD
2 SEGM NAME=COURSE,BYTES=256
3 FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4 FIELD NAME=TITLE,BYTES=33,START=4
5 FIELD NAME=DESCRIPN,BYTES=220,START=37
6 SEGM NAME=PREREQ,PARENT=COURSE,BYTES=36
7 FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8 FIELD NAME=TITLE,BYTES=33,START=4
9 SEGM NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
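A rough sketch of the hierarchy such a DBD defines (using Python dicts rather than real IMS storage; the field values are invented) shows COURSE as the root segment with PREREQ and OFFERING children, and how a pre-order walk visits parent segments before their children, as hierarchical access does:

```python
# One database record occurrence, as nested dicts (values invented)
course = {
    "segment": "COURSE", "COURSE#": "M23", "TITLE": "Dynamics",
    "children": [
        {"segment": "PREREQ", "COURSE#": "M16", "TITLE": "Trigonometry",
         "children": []},
        {"segment": "OFFERING", "DATE": "730813", "LOCATION": "OSLO",
         "children": [
             {"segment": "STUDENT", "EMP#": "864173", "NAME": "Hansen",
              "children": []},
         ]},
    ],
}

def segments(node):
    """Pre-order walk: each parent segment is visited before its children."""
    yield node["segment"]
    for child in node["children"]:
        yield from segments(child)

order = list(segments(course))
print(order)  # ['COURSE', 'PREREQ', 'OFFERING', 'STUDENT']
```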
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block) - Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block) - The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT - The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency - The value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key - A possible key. Each non-key field is functionally dependent on every candidate
key, and no attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of the functional dependencies used in normalization:
They have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of
the dependency, hold for all time, and are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation,
and has the property that every functional dependency in Y is implied by the
functional dependencies in X.
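The definition above can be checked mechanically: X → Y holds in a relation iff no two tuples agree on X but disagree on Y. The helper below is my own sketch (function name and sample data invented, not from the text):

```python
def fd_holds(rows, lhs, rhs):
    """Return True iff the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if x in seen and seen[x] != y:
            return False  # two tuples agree on lhs but disagree on rhs
        seen[x] = y
    return True

staff = [
    {"staff_no": "S1", "branch": "B1", "position": "Manager"},
    {"staff_no": "S2", "branch": "B1", "position": "Assistant"},
    {"staff_no": "S3", "branch": "B2", "position": "Manager"},
]
print(fd_holds(staff, ["staff_no"], ["branch"]))  # True: staff_no determines branch
print(fd_holds(staff, ["position"], ["branch"]))  # False: Manager maps to B1 and B2
```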
(D) Explain 4 NF with examples
Ans Normalization - The process of decomposing unsatisfactory ("bad") relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal-form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF.
We will pay particular attention up to 3NF.
Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional
dependencies between its attributes.
It is often executed as a series of steps, where each step corresponds to a specific normal form with
known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all its multi-valued dependencies are functional dependencies. 4NF
removes unwanted data structures: multi-valued dependencies.
Either of these conditions must hold for a relation to be in fourth normal form:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it uses
multivalued dependencies.
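A small illustration (my own sketch; the course/teacher/text data is invented, not from the paper): when COURSE →→ TEACHER and COURSE →→ TEXT hold, teachers and texts vary independently, so the table is split into two projections whose natural join reconstructs the original rows:

```python
# Relation with two independent multivalued facts about each course
ctx = [("Physics", "Green", "Mechanics"),
       ("Physics", "Green", "Optics"),
       ("Physics", "Brown", "Mechanics"),
       ("Physics", "Brown", "Optics")]

# 4NF decomposition: one table per multivalued dependency
course_teacher = sorted({(c, t) for c, t, _ in ctx})
course_text    = sorted({(c, b) for c, _, b in ctx})
print(course_teacher)  # [('Physics', 'Brown'), ('Physics', 'Green')]
print(course_text)     # [('Physics', 'Mechanics'), ('Physics', 'Optics')]

# The natural join of the two projections reconstructs the original rows
rejoined = sorted((c, t, b) for c, t in course_teacher
                  for c2, b in course_text if c == c2)
print(rejoined == sorted(ctx))  # True - the decomposition is lossless
```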
Q5
Either
(A) What are object oriented database systems What are its features
Ans Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the
set of all its versions, and object versions can be treated as objects in their own right. Some object
databases also provide systematic support for triggers and constraints, which are the basis of active
databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could get a user's account information and
efficiently provide extensive information such as transactions, account information entries, etc.
(C) How is database recovery done? Discuss its different types.
Ans SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000.
It has a built-in feature known as the database recovery model that controls the following:
The speed and size of your transaction log backups.
The degree to which you might be at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery models available:
Full Recovery
Bulk Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
The Log Marks feature allows you to place reference points in the transaction log, so that you can
recover to a log mark.
This model logs CREATE INDEX operations; recovery from a transaction log backup that includes index
creations is done at a faster pace, because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT and UPDATETEXT.
Simple Recovery Model
It allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(D) Describe deadlocks in a distributed system.
Ans A deadlock occurs when a set of transactions wait for one another in a circular chain, each holding a lock that another transaction in the chain needs, so that none of them can proceed. In a distributed system the transactions and the data items they lock may reside at different sites, so the circular wait can span several sites, and no single site can see the whole cycle in its local wait-for graph. Distributed deadlocks are therefore handled by detection (constructing a global wait-for graph, either at a central coordinator or by passing probe messages between sites, and aborting a victim transaction when a cycle is found), by prevention schemes such as the timestamp-based wound-wait and wait-die protocols, or simply by timeouts.
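As a hedged illustration of distributed deadlock detection (all transaction and site names are invented), each site's local wait-for graph can be acyclic while the merged global graph contains a cycle, which is why detection needs a global view:

```python
def has_cycle(wait_for):
    """Detect a cycle in a wait-for graph given as {txn: [txns it waits for]}."""
    visited, on_stack = set(), set()

    def dfs(t):
        visited.add(t)
        on_stack.add(t)
        for u in wait_for.get(t, ()):
            if u in on_stack or (u not in visited and dfs(u)):
                return True
        on_stack.discard(t)
        return False

    return any(dfs(t) for t in wait_for if t not in visited)

# Site 1 sees only T1 waiting for T2; site 2 sees only T2 waiting for T1.
site1 = {"T1": ["T2"]}
site2 = {"T2": ["T1"]}
merged = {"T1": ["T2"], "T2": ["T1"]}  # the global wait-for graph

# Neither local graph shows the cycle, but the global graph does
print(has_cycle(site1), has_cycle(site2), has_cycle(merged))  # False False True
```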
Some well-known DBMSs are Microsoft SQL Server Microsoft Access Oracle SAP and
others
Components of DBMS
DBMS have several components each performing very significant tasks in the database
management system environment Below is a list of components within the database and its
environment
Software
This is the set of programs used to control and manage the overall database This includes the
DBMS software itself the Operating System the network software being used to share the data
among users and the application programs used to access data in the DBMS
Hardware
Consists of a set of physical electronic devices such as computers, I/O devices, storage devices,
etc.; this provides the interface between the computers and the real-world systems.
Data
DBMS exists to collect, store, process and access data, the most important component. The
database contains both the actual (or operational) data and the metadata.
Procedures
These are the instructions and rules that assist in how to use the DBMS, and in designing and
running the database, using documented procedures to guide the users that operate and manage
it.
Database Access Language
This is used to access the data to and from the database: to enter new data, update existing data,
or retrieve required data from the database. The user writes a set of appropriate commands in a
database access language and submits these to the DBMS, which then processes the data and
generates and displays a set of results in a user-readable form.
Query Processor
This transforms the user queries into a series of low-level instructions. It reads the online
user's query and translates it into an efficient series of operations, in a form capable of being sent
to the run-time data manager for execution.
Run Time Database Manager
Sometimes referred to as the database control system, this is the central software component of
the DBMS that interfaces with user-submitted application programs and queries, and handles
database access at run time. Its function is to convert the operations in users' queries. It provides
control to maintain the consistency, integrity and security of the data.
Data Manager
Also called the cache manager, this is responsible for handling data in the database, providing
recovery to the system and allowing it to recover the data after a failure.
Database Engine
The core service for storing, processing and securing data; this provides controlled access and
rapid transaction processing to address the requirements of the most demanding data-consuming
applications. It is often used to create relational databases for online transaction processing or
online analytical processing.
Data Dictionary
This is a reserved space within a database used to store information about the database itself. A
data dictionary is a set of read-only tables and views containing information about the data used
in the enterprise, ensuring that the database representation of the data follows one standard as
defined in the dictionary.
Report Writer
Also referred to as the report generator, it is a program that extracts information from one or
more files and presents the information in a specified format. Most report writers allow the user
to select records that meet certain conditions, to display selected fields in rows and columns,
or to format the data into different charts.
OR
(C) Explain the three level architecture proposal for DBMS.
In the previous tutorial we have seen the DBMS architectures - one-tier, two-tier and three-tier. In
this guide we will discuss the three level DBMS architecture in detail.
DBMS Three Level Architecture Diagram
This architecture has three levels
1 External level
2 Conceptual level
3 Internal level
1 External level
It is also called the view level. This level is called the "view" level because several users can view their desired data from it; the data is internally fetched from the database with the help of the conceptual and internal level mappings.
The user does not need to know database schema details such as the data structure, table definitions, etc.; the user is concerned only with the data, which is returned to the view level after it has been fetched from the database (present at the internal level).
The external level is the top level of the three-level DBMS architecture.
2 Conceptual level
It is also called the logical level. The whole design of the database, such as the relationships among the data, the schema of the data, etc., is described at this level.
Database constraints and security are also implemented at this level of the architecture. This level is maintained by the DBA (database administrator).
3 Internal level
This level is also known as the physical level. It describes how the data is actually stored on the storage devices and is also responsible for allocating space to the data. This is the lowest level of the architecture.
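One way the external level shows up in practice is as SQL views defined over the conceptual schema. The sketch below is only an illustration of that idea; the table, view and column names are invented, not part of the question.

```python
# Sketch: the external level realized as a SQL view over one conceptual schema.
# All names here are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: the full logical schema of the enterprise.
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'HR', 30000), (2, 'Ravi', 'IT', 45000)")
# External level: this user group sees only names and departments, never salaries.
conn.execute("CREATE VIEW hr_view AS SELECT name, dept FROM employee")
rows = conn.execute("SELECT * FROM hr_view ORDER BY name").fetchall()
print(rows)  # [('Asha', 'HR'), ('Ravi', 'IT')]
```

The internal level (how SQLite lays the rows out in pages) stays hidden from both the view and the base-table user.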
(d) Explain
(i) Data Independence
o Data independence can be explained using the three-schema architecture
o Data independence refers to the characteristic of being able to modify the schema at one level of the database system without altering the schema at the next higher level
There are two types of data independence
1 Logical Data Independence
o Logical data independence refers to the characteristic of being able to change the conceptual schema without having to change the external schema
o Logical data independence is used to separate the external level from the conceptual view
o If we make changes in the conceptual view of the data, the user's view of the data will not be affected
o Logical data independence occurs at the user interface level
2 Physical Data Independence
o Physical data independence can be defined as the capacity to change the internal schema without having to change the conceptual schema
o If we make changes in the storage structures of the database system server, the conceptual structure of the database will not be affected
o Physical data independence is used to separate the conceptual level from the internal level
o Physical data independence occurs at the logical interface level
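Logical data independence can be demonstrated concretely: a change to the conceptual schema (adding a column) leaves an existing external view working unchanged. This is a minimal sketch with invented names.

```python
# Sketch: logical data independence - the conceptual schema changes,
# but an existing external view keeps working. Names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE VIEW student_names AS SELECT name FROM student")
conn.execute("INSERT INTO student (roll, name) VALUES (1, 'Meera')")
# Conceptual schema changes: a new column is added ...
conn.execute("ALTER TABLE student ADD COLUMN phone TEXT")
# ... but the external view is unaffected by the change.
names = conn.execute("SELECT name FROM student_names").fetchall()
print(names)  # [('Meera',)]
```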
(ii) Data Integration
Ans
Data integration involves combining data residing in different sources and providing users with a unified view of them [1]. This process becomes significant in a variety of situations, which include both commercial (such as when two similar companies need to merge their databases) and scientific (combining research results from different bioinformatics repositories, for example) domains. Data integration appears with increasing frequency as the volume (that is, big data [2]) and the need to share existing data explode [3]. It has become the focus of extensive theoretical work, and numerous open problems remain unsolved. Data integration encourages collaboration between internal as well as external users.
Figure 1: Simple schematic for a data warehouse. The extract, transform, load (ETL) process extracts information from the source databases, transforms it and then loads it into the data warehouse.
Figure 2: Simple schematic for a data-integration solution. A system designer constructs a mediated schema against which users can run queries. The virtual database interfaces with the source databases via wrapper code if required.
Issues with combining heterogeneous data sources, often referred to as information silos, under a single query interface have existed for some time. In the early 1980s computer scientists began designing systems for interoperability of heterogeneous databases [4]. The first data integration system driven by structured metadata was designed at the University of Minnesota in 1991 for the Integrated Public Use Microdata Series (IPUMS). IPUMS used a data warehousing approach, which extracts, transforms and loads data from heterogeneous sources into a single view schema so that data from different sources become compatible [5]. By making thousands of population databases interoperable, IPUMS demonstrated the feasibility of large-scale data integration. The data warehouse approach offers a tightly coupled architecture because the data are already physically reconciled in a single queryable repository, so it usually takes little time to resolve queries [6].
The data warehouse approach is less feasible for data sets that are frequently updated, requiring the extract, transform, load (ETL) process to be continuously re-executed for synchronization.
Difficulties also arise in constructing data warehouses when one has only a query interface to
summary data sources and no access to the full data This problem frequently emerges when
integrating several commercial query services like travel or classified advertisement web
applications
As of 2009, the trend in data integration favored loosening the coupling between data and providing a unified query interface to access real-time data over a mediated schema (see Figure 2), which allows information to be retrieved directly from the original databases. This is consistent with the SOA approach popular in that era. This approach relies on mappings between the mediated schema and the schema of the original sources, and on transforming a query into specialized queries to match the schema of the original databases. Such mappings can be specified in two ways: as a mapping from entities in the mediated schema to entities in the original sources (the Global As View (GAV) approach), or as a mapping from entities in the original sources to the mediated schema (the Local As View (LAV) approach). The latter approach requires more sophisticated inferences to resolve a query on the mediated schema, but makes it easier to add new data sources to a (stable) mediated schema.
As of 2010, some of the work in data integration research concerns the semantic integration problem. This problem addresses not the structuring of the architecture of the integration, but how to resolve semantic conflicts between heterogeneous data sources. For example, if two companies merge their databases, certain concepts and definitions in their respective schemas, like earnings, inevitably have different meanings. In one database it may mean profits in dollars (a floating-point number), while in the other it might represent the number of sales (an integer). A common strategy for the resolution of such problems involves the use of ontologies, which explicitly define schema terms and thus help to resolve semantic conflicts; this approach represents ontology-based data integration. On the other hand, the problem of combining research results from different bioinformatics repositories requires benchmarking of the similarities, computed from different data sources, on a single criterion such as positive predictive value. This enables the data sources to be directly comparable, and they can be integrated even when the natures of the experiments are distinct [7].
As of 2011, it was determined that current data modeling methods were imparting data isolation into every data architecture in the form of islands of disparate data and information silos. This data isolation is an unintended artifact of the data modeling methodology that results in the development of disparate data models; disparate data models, when instantiated as databases, form disparate databases. Enhanced data model methodologies have been developed to eliminate the data isolation artifact and to promote the development of integrated data models [8]. One enhanced data modeling method recasts data models by augmenting them with structural metadata in the form of standardized data entities. As a result of recasting multiple data models, the set of recast data models will share one or more commonality relationships that relate the structural metadata now common to these data models. Commonality relationships are a peer-to-peer type of entity relationship that relates the standardized data entities of multiple data models. Multiple data models that contain the same standard data entity may participate in the same commonality relationship. When integrated data models are instantiated as databases and are properly populated from a common set of master data, then these databases are integrated.
Since 2011, data hub approaches have been of greater interest than fully structured (typically relational) enterprise data warehouses. Since 2013, data lake approaches have risen to the level of data hubs (see the three search terms' popularity on Google Trends [9]). These approaches combine unstructured or varied data into one location but do not necessarily require an (often complex) master relational schema to structure and define all data in the hub.
Q2
EITHER
(a) Explain E-R Model with suitable example
Ans The E-R (entity-relationship) model is a "top-down" approach: it allows us to describe how data is used in a real-world enterprise. Designing an E-R model is an iterative, team-oriented process in which all business managers (or their designates) should be involved, and the design should be validated with a "bottom-up" approach. The model has three primary components: entities, relationships and attributes. Many notation methods exist; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order. An entity type is a category; an entity, strictly speaking, is an instance of a given entity type, and there are usually many instances of an entity type. Because the term entity type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes An attribute is a characteristic of an entity. A Student entity's attributes include student ID, student name, address, etc.
Attributes are of various types
Simple/Single attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer; a supervises relationship between an employee and a department; a performs relationship between an artist and a song; a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
(b) Given: entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number)
An entity-relationship diagram displays the relationships of the entity sets stored in a database. In other words, we can say that ER diagrams help you to explain the logical structure of databases. At first look an ER diagram looks very similar to a flowchart; however, an ER diagram includes many specialized symbols whose meanings make this model unique.
Sample ER Diagram
Facts about the ER Diagram Model
o The ER model allows you to draw a database design
o It is an easy-to-use graphical tool for modeling data
o It is widely used in database design
o It is a GUI representation of the logical structure of a database
o It helps you to identify the entities which exist in a system and the relationships between those entities
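The Customer entity from question (b) can be mapped to a relational table by flattening its composite attributes (name, address, street) into plain columns. The sketch below shows one common mapping; the column types are assumptions for illustration only.

```python
# Sketch: one way to map the Customer entity of question (b) to a relational
# table - composite attributes are flattened into columns. Types are assumed.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,
        first_name       TEXT,
        middle_name      TEXT,
        last_name        TEXT,
        phone_number     TEXT,
        date_of_birth    TEXT,
        city             TEXT,
        state            TEXT,
        zip_code         TEXT,
        street_name      TEXT,
        street_number    TEXT,
        apartment_number TEXT
    )
""")
conn.execute(
    "INSERT INTO customer (customer_id, first_name, last_name) VALUES (?, ?, ?)",
    (101, 'John', 'Doe'),
)
row = conn.execute("SELECT customer_id, first_name, last_name FROM customer").fetchone()
print(row)  # (101, 'John', 'Doe')
```

A multivalued attribute (e.g. several phone numbers) would instead go into its own table keyed by customer_id.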
(b) Differentiate between the Network and Hierarchical data models in DBMS.
Ans Hierarchical model
1 One-to-many or one-to-one relationships
2 Based on the parent-child relationship
3 Retrieval algorithms are complex and asymmetric
4 More data redundancy
Network model
1 Many-to-many relationships
2 Many parents as well as many children
3 Retrieval algorithms are complex and symmetric
4 Less data redundancy than the hierarchical model
Relational model
1 One-to-one, one-to-many and many-to-many relationships
2 Based on relational data structures
3 Retrieval algorithms are simple and symmetric
4 Less data redundancy
OR
(c) Draw an E-R diagram for a Library Management System.
Ans
(d) State advantages and disadvantages of the following file organizations
(i) Index-Sequential file
Ans
Sequential File Organization
1 A sequential file is designed for efficient processing of records in sorted order on some
search key
o Records are chained together by pointers to permit fast retrieval in search key
order
o Pointer points to next record in order
o Records are stored physically in search key order (or as close to this as possible)
o This minimizes number of block accesses
o Figure 10.15 shows an example with bname as the search key
2 It is difficult to maintain physical sequential order as records are inserted and deleted
o Deletion can be managed with the pointer chains
o Insertion poses problems if there is no space where the new record should go
o If there is space, use it; else put the new record in an overflow block
o Adjust pointers accordingly
o Figure 10.16 shows the previous example after an insertion
o Problem: we now have some records out of physical sequential order
o If very few records in overflow blocks this will work well
o If order is lost reorganize the file
o Reorganizations are expensive and done when system load is low
3 If insertions rarely occur we could keep the file in physically sorted order and reorganize
when insertion occurs In this case the pointer fields are no longer required
The Sequential File
Fixed format used for records
Records are the same length
All fields the same (order and length)
Field names and lengths are attributes of the file
One field is the key field
Uniquely identifies the record
Records are stored in key sequence
The Sequential File
New records are placed in a log file or transaction file
Batch update is performed to merge the log file with the master file
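The batch update just described, merging a sorted log (transaction) file into the key-ordered master file, is an ordinary two-way merge. A minimal sketch, with invented data:

```python
# Sketch of the batch update described above: new records accumulate in a
# log (transaction) file and are later merged with the key-ordered master file.
def merge_master(master, log):
    """Merge two lists of (key, record) pairs, each sorted by key."""
    result, i, j = [], 0, 0
    while i < len(master) and j < len(log):
        if master[i][0] <= log[j][0]:
            result.append(master[i]); i += 1
        else:
            result.append(log[j]); j += 1
    return result + master[i:] + log[j:]

master = [(10, 'A'), (20, 'B'), (40, 'D')]
log = [(15, 'X'), (30, 'Y')]   # records inserted since the last reorganization
merged = merge_master(master, log)
print(merged)  # [(10, 'A'), (15, 'X'), (20, 'B'), (30, 'Y'), (40, 'D')]
```

Because each list is read sequentially exactly once, the merge is cheap even for large files, which is why it is scheduled as a batch job.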
(ii) Direct file
Direct Access File System (DAFS) is a network file system, similar to Network File System (NFS) and Common Internet File System (CIFS), that allows applications to transfer data while bypassing operating system control, buffering and network protocol operations that can bottleneck throughput. DAFS uses the Virtual Interface (VI) architecture as its underlying transport mechanism. Using VI hardware, an application transfers data to and from application buffers without using the operating system, which frees up the processor and operating system for other processes and allows files to be accessed by servers using several different operating systems. DAFS is designed and optimized for clustered, shared-file network environments that are commonly used for Internet, e-commerce and database applications. DAFS is optimized for high-bandwidth InfiniBand networks, and it works with any interconnection that supports VI, including Fibre Channel and Ethernet.
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and promote DAFS. Today more than 85 companies are part of the DAFS Collaborative.
Q3
EITHER
(a) Explain tuple relational calculus
Ans
Relational Calculus
A relational calculus query specifies what is to be retrieved rather than how to retrieve it; there is no description of how to evaluate the query.
In first-order logic (or predicate calculus), a predicate is a truth-valued function with arguments.
When we substitute values for the arguments, the function yields an expression, called a proposition, which can be either true or false.
Relational Calculus
If a predicate contains a variable (e.g. 'x is a member of staff'), there must be a range for x.
When we substitute some values of this range for x, the proposition may be true; for other values it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
Tuple Relational Calculus
We are interested in finding tuples for which a predicate is true; the calculus is based on the use of tuple variables.
A tuple variable is a variable that 'ranges over' a named relation, i.e. a variable whose only permitted values are tuples of the relation.
We specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Tuple Relational Calculus - Example
To find details of all staff earning more than $10000:
{S | Staff(S) ∧ S.salary > 10000}
To retrieve a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Tuple Relational Calculus
We can use two quantifiers to tell how many instances the predicate applies to:
Existential quantifier ∃ ('there exists')
Universal quantifier ∀ ('for all')
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called free variables.
Tuple Relational Calculus
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧ (B.branchNo = S.branchNo) ∧ B.city = 'London')
This means 'There exists a Branch tuple with the same branchNo as the branchNo of the current Staff tuple S that is located in London'.
Tuple Relational Calculus
The universal quantifier is used in statements about every instance, such as:
(∀B) (B.city ≠ 'Paris')
This means 'For all Branch tuples, the address is not in Paris'.
We can also use ~(∃B)(B.city = 'Paris'), which means 'There are no branches with an address in Paris'.
Tuple Relational Calculus
Formulae should be unambiguous and make sense.
A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
We can recursively build up formulae from atoms:
An atom is a formula
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction F1 ∨ F2 and negation ~F1
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also formulae
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25000:
{S.fName, S.lName | Staff(S) ∧ S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow:
{S | Staff(S) ∧ (∃P)(PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Tuple Relational Calculus
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
To avoid this, we add the restriction that all values in the result must be values in the domain of the expression.
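A tuple-calculus expression such as {S | Staff(S) ∧ S.salary > 10000} reads naturally as a comprehension over the Staff relation. The sketch below is only an analogy with invented data, not part of the calculus itself.

```python
# Sketch: tuple-calculus queries read as comprehensions over a relation.
# The Staff data here is invented for illustration.
from collections import namedtuple

Staff = namedtuple('Staff', ['staffNo', 'name', 'salary'])
staff = [Staff('S1', 'Ann', 12000), Staff('S2', 'Ben', 9000), Staff('S3', 'Cal', 15000)]

# {S | Staff(S) ∧ S.salary > 10000}: whole tuples satisfying the predicate.
high_paid = [s for s in staff if s.salary > 10000]
# {S.salary | Staff(S) ∧ S.salary > 10000}: project out one attribute.
salaries = [s.salary for s in staff if s.salary > 10000]
print([s.name for s in high_paid])  # ['Ann', 'Cal']
print(salaries)                     # [12000, 15000]
```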
Data Manipulations in SQL
Select, Update, Delete, Insert statements
Basic data retrieval
Condition specification
Arithmetic and aggregate operators
SQL join: multiple-table queries
Set manipulation:
Any, In, Contains, All, Not In, Not Contains, Exists, Union, Minus, Intersect
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement
Data must be entered later using INSERT
CREATE TABLE S ( SNO CHAR(5),
SNAME CHAR(20),
STATUS DECIMAL(3),
CITY CHAR(15),
PRIMARY KEY (SNO) )
Creating Tables
A table name and unique column names must be specified
Columns which are defined as primary keys will never have two rows with the same key
value
A primary key may consist of more than one column (values unique in combination); this is called a composite key
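A composite key can be demonstrated with a supplier-part table, where only the combination of the two columns must be unique. This is a sketch; the SP table and its data are illustrative.

```python
# Sketch: a composite primary key (unique in combination), as described above.
# The supplier-part table SP and its data are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SP (
        SNO CHAR(5),
        PNO CHAR(6),
        QTY DECIMAL(9),
        PRIMARY KEY (SNO, PNO)
    )
""")
conn.execute("INSERT INTO SP VALUES ('S1', 'P1', 300)")
conn.execute("INSERT INTO SP VALUES ('S1', 'P2', 200)")      # same SNO, new PNO: allowed
try:
    conn.execute("INSERT INTO SP VALUES ('S1', 'P1', 100)")  # duplicate combination: rejected
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
print(duplicate_allowed)  # False
```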
(b) Explain Data Manipulation in SQL
Ans
A data manipulation language (DML) is a computer programming language used for adding (inserting), deleting and modifying (updating) data in a database. A DML is often a sublanguage of a broader database language such as SQL, with the DML comprising some of the operators in the language [1]. Read-only selecting of data is sometimes distinguished as being part of a separate data query language (DQL), but it is closely related and sometimes also considered a component of a DML; some operators may perform both selecting (reading) and writing.
A popular data manipulation language is that of Structured Query Language (SQL) which is
used to retrieve and manipulate data in a relational database[2] Other forms of DML are those
used by IMSDLI CODASYL databases such as IDMS and others
In SQL the data manipulation language comprises the SQL-data change statements[3] which
modify stored data but not the schema or database objects Manipulation of persistent database
objects eg tables or stored procedures via the SQL schema statements[3] rather than the data
stored within them is considered to be part of a separate data definition language (DDL) In SQL
these two categories are similar in their detailed syntax data types expressions etc but distinct
in their overall function[3]
The SQL-data change statements are a subset of the SQL-data statements; this set also contains the SELECT query statement [3], which, strictly speaking, is part of the DQL, not the DML. In common practice, though, this distinction is not made, and SELECT is widely considered to be part of the DML [4], so the DML consists of all SQL-data statements, not only the SQL-data change statements. The SELECT ... INTO form combines both selection and manipulation, and thus is strictly considered to be DML because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a
statement which is almost always a verb In the case of SQL these verbs are
SELECT ... FROM ... WHERE (strictly speaking, DQL)
SELECT ... INTO
INSERT INTO ... VALUES
UPDATE ... SET ... WHERE
DELETE FROM ... WHERE
For example, the command to insert a row into the table employees:
INSERT INTO employees (first_name, last_name, fname)
VALUES ('John', 'Capita', 'xcapit00');
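The four verbs listed above can be run end to end against a throwaway table. This sketch reuses the employees example; the extra column values are invented.

```python
# Sketch: the DML statements listed above, run against a throwaway table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")
# INSERT INTO ... VALUES
conn.execute("INSERT INTO employees (first_name, last_name, fname) "
             "VALUES ('John', 'Capita', 'xcapit00')")
# UPDATE ... SET ... WHERE
conn.execute("UPDATE employees SET last_name = 'Smith' WHERE fname = 'xcapit00'")
# SELECT ... FROM ... WHERE (strictly speaking, DQL)
row = conn.execute("SELECT first_name, last_name FROM employees "
                   "WHERE fname = 'xcapit00'").fetchone()
print(row)  # ('John', 'Smith')
# DELETE FROM ... WHERE
conn.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(count)  # 0
```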
OR
(c) Explain following integrity rules
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce these rules automatically, but it is safer to make sure that the rules are already applied in the design. There are two types of integrity mentioned in integrity rules: entity and referential. Two additional rules that aren't necessarily included in integrity rules but are pertinent to database designs are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that is unique; this ensures that each row is uniquely identified by the primary key. One requirement for entity integrity is that a primary key cannot have a null value. The purpose of this integrity is to give each row a unique identity so that foreign key values can properly reference primary key values.
Theta Join
In a theta join we apply a condition on the input relation(s), and then only the selected rows are used in the cross product to be merged and included in the output. In a normal cross product all the rows of one relation are mapped/merged with all the rows of the second relation, but here only the selected rows of a relation take part in the cross product with the second relation.
If R and S are two relations, then θ is the condition which is applied for the select operation on one relation, and then only the selected rows form a cross product with all the rows of the second relation. For example, given the two relations FACULTY and COURSE, we first apply the select operation on the FACULTY relation to select certain specific rows, and then these rows form a cross product with the COURSE relation; this is the difference between the cross product and the theta join. Examining both relations, their different attributes, and finally the cross product after carrying out the select operation makes the difference between the cross product and the theta join clear.
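The FACULTY/COURSE example can be sketched in a few lines: the theta join is simply the cross product filtered by the join condition. The data and attribute layout below are invented for illustration.

```python
# Sketch: theta join as a selection over the cross product, using invented
# FACULTY and COURSE relations as in the example above.
faculty = [(1, 'Khan'), (2, 'Rao')]          # (fac_id, name)
course = [('C1', 1), ('C2', 1), ('C3', 2)]   # (course_id, fac_id)

# Cross product: every FACULTY row paired with every COURSE row ...
cross = [(f, c) for f in faculty for c in course]
# ... while the theta join keeps only pairs satisfying fac_id = fac_id.
theta = [(f, c) for f in faculty for c in course if f[0] == c[1]]
print(len(cross))  # 6
print(theta)
```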
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having the foreign key (in the associated table) reference a primary key value (in the primary, or parent, table). Because of this, we need to ensure that data on both sides of the relationship remain intact.
So, referential integrity requires that whenever a foreign key value is used it must reference a valid, existing primary key in the parent table.
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no foreign key in any related table with the value of 15. We should only be able to delete a primary key if there are no associated records; otherwise we would end up with an orphaned record.
Here, the related table contains a foreign key value that doesn't exist in the primary key field of the primary table (i.e. the "CompanyId" field). This has resulted in an "orphaned record".
So referential integrity will prevent users from:
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that would result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
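All three rules can be seen in action with a foreign key constraint. The sketch below uses the CompanyId example from above with invented table contents; note that SQLite only enforces foreign keys when the pragma is switched on.

```python
# Sketch: a foreign key enforcing the three referential-integrity rules above.
# SQLite needs enforcement enabled explicitly; data is illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE company (CompanyId INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE product (
        ProductId INTEGER PRIMARY KEY,
        CompanyId INTEGER REFERENCES company(CompanyId)
    )
""")
conn.execute("INSERT INTO company VALUES (15, 'Acme')")
conn.execute("INSERT INTO product VALUES (1, 15)")        # valid parent: allowed
try:
    conn.execute("INSERT INTO product VALUES (2, 99)")    # no such company: orphan rejected
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False
try:
    conn.execute("DELETE FROM company WHERE CompanyId = 15")  # would orphan product 1
    delete_allowed = True
except sqlite3.IntegrityError:
    delete_allowed = False
print(orphan_allowed, delete_allowed)  # False False
```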
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually with no indication of an error. This could result in records being "lost" in the database because they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated company).
Or, worse yet, it could result in customers not receiving products they paid for.
Worse still, it could affect life-and-death situations, such as a hospital patient not receiving the correct treatment or a disaster relief team not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity which is concerned with the accuracy and
consistency of all data (relationship or otherwise) Maintaining data integrity is a crucial part of
working with databases
(d) Explain the following in detail with examples:
(i) Domain
Ans Definition: The domain of a database attribute is the set of all allowable values that attribute may assume.
Examples
A field for gender may have the domain {male, female, unknown}, where those three values are the only permitted entries in that column.
In data management and database analysis a data domain refers to all the unique values which
a data element may contain The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values[1]
For example, a database table that has information about people, with one record per person, might have a gender column. This gender column might be declared as a string data type and allowed to have one of two known code values: M for male, F for female, and NULL for records where gender is unknown or not applicable (or, arguably, U for unknown as a sentinel value). The data domain for the gender column is {M, F}.
In a normalized data model the reference domain is typically specified in a reference table. Following the previous example, a Gender reference table would have exactly two records, one per allowed value (excluding NULL). Reference tables are formally related to other tables in a database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check constraint or, in more complex cases, in a database trigger. For example, a column requiring positive numeric values may have a check constraint declaring that the values must be greater than zero.
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
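The gender domain {M, F} described above can be enforced directly with a check constraint. A minimal sketch with invented names:

```python
# Sketch: enforcing the gender domain {M, F} with a CHECK constraint,
# as described above. Table and data are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person (
        name   TEXT,
        gender TEXT CHECK (gender IN ('M', 'F'))  -- NULL still passes, for 'unknown'
    )
""")
conn.execute("INSERT INTO person VALUES ('Asha', 'F')")
conn.execute("INSERT INTO person VALUES ('Unknown', NULL)")  # NULL passes the check
try:
    conn.execute("INSERT INTO person VALUES ('Bad', 'X')")   # outside the domain
    out_of_domain = True
except sqlite3.IntegrityError:
    out_of_domain = False
print(out_of_domain)  # False
```

A CHECK condition that evaluates to NULL is treated as satisfied, which is why the NULL row is accepted; a NOT NULL constraint would be needed to forbid it.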
(ii) Degree and cardinality
The degree of a relationship (also known as cardinality) is the number of occurrences in one entity which are associated (or linked) with the number of occurrences in another.
There are three degrees of relationship, known as:
1 one-to-one (1:1)
2 one-to-many (1:M)
3 many-to-many (M:N)
The last is written M:N rather than M:M because the two sides need not have the same number of occurrences.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-to-one relationship rarely exists in practice, but it can; if one does occur, you may consider combining the two entities into one.
For example, an employee is allocated a company car which can only be driven by that employee.
Therefore there is a one-to-one relationship between employee and company car.
One-to-many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example, taking the employee and department entities shown on the previous page, an employee works in one department but a department has many employees.
Therefore there is a one-to-many relationship between department and employee.
Many-to-many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would prevent any such relationships, but the definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely exist; normally they occur because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a team of many employees.
Therefore there is a many-to-many relationship between employee and project.
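In a relational design the employee-project M:N relationship is usually resolved into two 1:M relationships through a junction (link) table. This is a sketch with invented names and data.

```python
# Sketch: resolving the many-to-many employee-project relationship with a
# junction (link) table, as is usual in relational design. Names are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);
    -- The M:N relationship becomes two 1:M relationships via works_on.
    CREATE TABLE works_on (
        emp_id  INTEGER REFERENCES employee(emp_id),
        proj_id INTEGER REFERENCES project(proj_id),
        PRIMARY KEY (emp_id, proj_id)
    );
    INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Website');
    INSERT INTO works_on VALUES (1, 10), (1, 20), (2, 10);
""")
# Asha works on two projects; project 10 has two employees.
pairs = conn.execute("""
    SELECT e.name, p.title FROM employee e
    JOIN works_on w ON w.emp_id = e.emp_id
    JOIN project p ON p.proj_id = w.proj_id
    ORDER BY e.name, p.title
""").fetchall()
print(pairs)  # [('Asha', 'Payroll'), ('Asha', 'Website'), ('Ravi', 'Payroll')]
```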
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans The acronym DBTG refers to the Data Base Task Group of the Conference on Data Systems Languages (CODASYL), the group responsible for standardization of the programming language COBOL. The DBTG final report appeared in April 1971; it introduced a new, distinct and self-contained language. The DBTG is intended to meet the requirements of many distinct programming languages, not just COBOL; the user in a DBTG system is considered to be an ordinary application programmer, and the language therefore is not biased toward any single specific programming language.
The DBTG proposal is based on the network model. In addition to proposing a formal notation for networks (the Data Definition Language, or DDL), the DBTG proposed a Subschema Data Definition Language (Subschema DDL) for defining views of a conceptual scheme that was itself defined using the Data Definition Language. It also proposed a Data Manipulation Language (DML) suitable for writing application programs that manipulate the conceptual scheme or a view.
Architecture of the DBTG Model
The architecture of a DBTG system is illustrated in the figure. It can be divided into three different levels, like the architecture of a database system:
o Storage Schema (corresponds to the Internal View of the database)
o Schema (corresponds to the Conceptual View of the database)
o Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema, written in a Data Storage Description Language (DSDL).
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists essentially of definitions of the various types of record in the database, the data items they contain and the sets into which they are grouped. (Here logical record types are referred to as record types; the fields in a logical record format are called data items.)
(m) Subschema
(n) The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data-items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data-item, and set are excluded.
(o) In the DBTG model, the users are application programmers writing in an ordinary
programming language, such as COBOL, that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema using the COBOL Data Base Facility; for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each type of
data-item) defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define normalization. Explain first and second normal form.
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF, or 4NF.
We will pay particular attention up to 3NF.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table, transform data from the information source (e.g. a form)
into table format with columns and rows.
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
If a table of data meets the definition of a relation, it is in first normal form:
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove the repeating group by:
entering appropriate data into the empty columns of rows containing the repeating
data ('flattening' the table)
Or by:
placing the repeating data, along with a copy of the original key attribute(s), into a
separate relation.
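The second option above (splitting the repeating group into a separate relation with a copy of the key) can be sketched in Python; the Student/Course data and names below are hypothetical illustrations, not from the source:

```python
# Unnormalized table: one row per student, with a repeating group of courses.
unf = [
    {"student_id": 1, "name": "Asha", "courses": ["DBMS", "OS"]},
    {"student_id": 2, "name": "Ravi", "courses": ["DBMS"]},
]

# 1NF: move the repeating group into a separate relation,
# carrying a copy of the original key (student_id).
students = [{"student_id": r["student_id"], "name": r["name"]} for r in unf]
enrolments = [
    {"student_id": r["student_id"], "course": c}
    for r in unf
    for c in r["courses"]
]

# Every row/column intersection now holds a single atomic value.
print(enrolments)  # one row per (student, course) pair
```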
Second Normal Form (2NF)
Based on concept of full functional dependency
Let A and B be attributes of a relation.
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A.
2NF - A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on the primary key
Second Normal Form (2NF)
1NF and no partial functional dependencies
Partial functional dependency when one or more non-key attributes are functionally
dependent on part of the primary key
Every non-key attribute must be defined by the entire key not just by part of the key
If a relation has a single attribute as its key then it is automatically in 2NF
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies on the primary key exist, remove them by placing them in a new
relation along with a copy of their determinant.
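As a sketch of these steps (the Order/Product relation and its FDs are hypothetical examples, not from the source): with composite key {order_id, product_id}, the FD product_id → product_name is partial, so it is moved to a new relation together with a copy of its determinant:

```python
# 1NF relation with composite key (order_id, product_id).
# FD: product_id -> product_name depends on only part of the key (partial).
order_lines = [
    {"order_id": 1, "product_id": "P1", "qty": 2, "product_name": "Pen"},
    {"order_id": 1, "product_id": "P2", "qty": 1, "product_name": "Pad"},
    {"order_id": 2, "product_id": "P1", "qty": 5, "product_name": "Pen"},
]

# 2NF: place the partially dependent attribute in a new relation
# along with a copy of its determinant (product_id).
products = {r["product_id"]: r["product_name"] for r in order_lines}
order_lines_2nf = [
    {"order_id": r["order_id"], "product_id": r["product_id"], "qty": r["qty"]}
    for r in order_lines
]

print(products)  # each product name now stored exactly once
```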
Third Normal Form (3NF)
2NF and no transitive dependencies.
Transitive dependency: a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency:
If A, B, and C are attributes of a relation such that A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C).
3NF - a relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with a suitable example.
Ans:
As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
1 NF2: non-first normal form
2 1NF: R is in 1NF iff all domain values are atomic
3 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5 BCNF: R is in BCNF iff every determinant is a candidate key
6 Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is in
4NF if and only if it is in BCNF and its multivalued dependencies are in fact functional
dependencies; 4NF thus removes the unwanted structures caused by multivalued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
considers multivalued dependencies.
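A textbook-style illustration (the Course/Teacher/Book relation here is a hypothetical example): if the teachers and books of a course are independent of each other, the MVDs Course →→ Teacher and Course →→ Book force every teacher/book combination to be stored, and the 4NF decomposition splits the relation in two:

```python
from itertools import product

teachers = {"DBMS": ["Rao", "Iyer"]}
books = {"DBMS": ["Navathe", "Date"]}

# Single relation Course-Teacher-Book with MVDs Course ->> Teacher
# and Course ->> Book: every combination must appear (redundancy).
ctb = [
    (c, t, b)
    for c in teachers
    for t, b in product(teachers[c], books[c])
]

# 4NF decomposition: two relations, no spurious combinations stored.
course_teacher = [("DBMS", t) for t in teachers["DBMS"]]
course_book = [("DBMS", b) for b in books["DBMS"]]

print(len(ctb))  # 2 teachers x 2 books = 4 rows before decomposition
```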
(d) What are inference axioms? Explain their significance in relational
database design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule stating that if a relation satisfies certain FDs, then it must satisfy
certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: if Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: if X → Y and X → Z, then X → YZ
F4 Projectivity: if X → YZ, then X → Y
F5 Transitivity: if X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: if X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1 Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1 Zip → City – given
2 Street Zip → Street City – augmentation of (1) by Street
3 City Street → Zip – given
4 City Street → City Street Zip – augmentation of (3) by City Street
5 Street Zip → City Street Zip – transitivity of (2) and (4)
[From Maier]
1 Let R = (A, B, C, D, E, G, H, I, J), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1 AB → E – given
2 AB → AB – reflexivity
3 AB → B – projectivity from (2)
4 AB → BE – additivity from (1) and (3)
5 BE → I – given
6 AB → I – transitivity from (4) and (5)
7 E → G – given
8 AB → G – transitivity from (1) and (7)
9 AB → GI – additivity from (6) and (8)
10 GI → H – given
11 AB → H – transitivity from (9) and (10)
12 AB → GH – additivity from (8) and (11)
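Derivations like the one above can be checked mechanically by computing the attribute closure X⁺ of the left-hand side: by Armstrong's axioms, X → Y holds iff Y ⊆ X⁺. A minimal sketch:

```python
def closure(attrs, fds):
    """Compute the closure of a set of attributes under a list of FDs.

    fds is a list of (lhs, rhs) pairs, each a set of attributes.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already derivable, add the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# F from the Maier example: AB->E, AG->J, BE->I, E->G, GI->H
F = [({"A", "B"}, {"E"}), ({"A", "G"}, {"J"}),
     ({"B", "E"}, {"I"}), ({"E"}, {"G"}), ({"G", "I"}, {"H"})]

print(closure({"A", "B"}, F))  # contains G and H, so AB -> GH holds
```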
Significance in relational database design: A relational database is a database structure, commonly used in GIS, in
which data is stored in two-dimensional tables and multiple relationships between data
elements can be defined and established in an ad hoc manner. A Relational Database Management
System is a database system made up of files with data elements in a two-dimensional array (rows
and columns). This database management system has the capability to recombine data elements
to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables:
• The tables are manipulated a set at a time, rather than a record at a time.
• SQL is used to manipulate relational databases. The model was proposed by Dr. Codd in 1970.
• It is the basis for the relational database management system (RDBMS).
• The relational model contains the following components:
• A collection of objects or relations
• A set of operations to act on the relations
Q5
EITHER
(a) What is a deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be dealt with in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order to prevent such instances. Essentially,
once a deadlock does occur, the DBMS must have a method for detecting the deadlock;
then, to resolve it, the DBMS must select a transaction to cancel and revert the entire
transaction until the required resources become available, allowing one transaction to
complete while the other has to be reprocessed at a later time.
Explain the meaning of the expression ACID transaction.
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it
should be atomic: it should either be complete or fully incomplete; there should not
be anything like a semi-complete state. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, the transactions
should be scheduled in such a fashion that they remain in isolation from one another. Durability
means that once a transaction commits, its effects will persist even if there are system failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the process of being
changed. Their purpose is to ensure consistency throughout the database. For example, if I
am changing a row which affects the calculations or outputs of several other rows, then
all rows that are affected, or possibly affected, by a change in the row I am working on will
be locked from changes until my change is complete. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let's
say I am processing a change to the tax rate in my state; my store clerk should not be able
to read the total cost of a blue shirt, because the total-cost row is affected by any change in
the tax-rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but has not been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation, and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
Binary locks - A lock on a data item can be in two states: it is either locked or
unlocked.
Shared/exclusive locks - This type of locking mechanism differentiates the locks based on
their use. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock: allowing more than one transaction to write the same data item
would lead the database into an inconsistent state. Read locks are shared, because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first
part, when the transaction starts executing, it seeks permission for the locks it requires. The
second part is where the transaction acquires all the locks. As soon as the transaction releases its
first lock, the third phase starts; in this phase, the transaction cannot demand any new locks, it
only releases the acquired locks.
Two-phase locking has two phases: one is growing, where all the locks are being acquired by
the transaction, and the second phase is shrinking, where the locks held by the transaction are
being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock.
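A minimal sketch of the two-phase discipline (a hypothetical single-transaction simulation; a real lock manager would also handle lock modes, blocking, and deadlock):

```python
class TwoPhaseTransaction:
    """Enforces 2PL: once any lock is released, no new lock may be acquired."""

    def __init__(self):
        self.locks = set()
        self.shrinking = False  # becomes True at the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after first unlock")
        self.locks.add(item)  # growing phase

    def unlock(self, item):
        self.shrinking = True  # shrinking phase begins
        self.locks.discard(item)

t = TwoPhaseTransaction()
t.lock("A")
t.lock("B")    # growing phase: acquire all locks
t.unlock("A")  # first release: shrinking phase starts
# t.lock("C") would now raise: not allowed under 2PL
```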
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the
transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a
lock after using it: Strict-2PL holds all the locks until the commit point and releases them all
at once.
Strict-2PL does not have cascading aborts, as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either the system time or a logical counter as a timestamp.
Lock-based protocols manage the order between conflicting pairs of transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 0002 would be older than all other
transactions that come after it; for example, any transaction y entering the system at 0004 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read-timestamp and write-timestamp. This lets the system
know when the last 'read' and 'write' operations were performed on the data item.
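The basic timestamp-ordering checks implied above can be sketched as follows (simplified; a real scheduler would also restart aborted transactions and might apply Thomas's write rule):

```python
class TimestampError(Exception):
    pass

class Item:
    def __init__(self):
        self.read_ts = 0   # latest read-timestamp
        self.write_ts = 0  # latest write-timestamp

def read(item, ts):
    # A transaction may not read a value already written by a younger one.
    if ts < item.write_ts:
        raise TimestampError("abort: transaction too old to read")
    item.read_ts = max(item.read_ts, ts)

def write(item, ts):
    # A transaction may not overwrite data already read or written
    # by a younger transaction.
    if ts < item.read_ts or ts < item.write_ts:
        raise TimestampError("abort: transaction too old to write")
    item.write_ts = ts

x = Item()
read(x, 5)   # transaction with timestamp 5 reads x
write(x, 7)  # younger transaction (7) writes x: allowed
# write(x, 3) would abort: a younger transaction already used x
```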
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented, and maintained by a database administrator
and/or other information security professionals.
Some of the ways database security is analyzed and implemented include
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash under a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d) Explain knowledge-based database systems in detail.
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database. At the time (the 1970s), virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database. At this point in the history of information technology, the distinction
between a database and a knowledge base was clear and unambiguous.
A database had the following properties
Flat data: Data was usually represented in a tabular format with strings or numbers in each
field.
Multiple users: A conventional database needed to support more than one user or system
logged into the same data at the same time.
Transactions: An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-
called ACID properties: Atomicity, Consistency, Isolation, and Durability.
Large, long-lived data: A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in the artificial
intelligence literature) with classes, subclasses, and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze, and reuse
knowledge; for example, see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database. The knowledge-base needed to know facts about the world: for example, to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store information about thousands of tables that
represented information about specific humans. Representing that all humans are mortal, and
being able to reason about any given human that they are mortal, is the work of a knowledge-
base; representing that George, Mary, Sam, Jenna, Mike, and hundreds of thousands of other
customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, the large database vendors, such as Oracle, added
capabilities to their products that provided support for knowledge-base requirements, such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet. With the rise of the Internet,
documents, hypertext, and multimedia support were now critical for any corporate database. It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents. This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes. Knowledge Management actually predated the
Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term knowledge-base to describe their repositories, but the
meaning had a subtle difference. In the case of previous knowledge-based systems, the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world. With knowledge management products, the knowledge was
primarily meant for humans, for example to serve as a repository of manuals, procedures,
policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by us humans.
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without their conflicting with one another. Concurrent access is quite easy if all
users are just reading data: there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-
user system. It helps you make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, "undividable") is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails.
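The transfer example can be sketched with Python's built-in sqlite3 module, whose connection context manager commits the transaction on success and rolls it back on error (the account names and amounts are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Atomically move `amount` from src to dst, or leave both unchanged."""
    try:
        with conn:  # transaction: commit on success, rollback on exception
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            bal = conn.execute("SELECT balance FROM account WHERE name = ?",
                               (src,)).fetchone()[0]
            if bal < 0:
                raise ValueError("insufficient funds")
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
    except ValueError:
        pass  # whole transaction rolled back: money neither lost nor created

transfer(conn, "A", "B", 30)    # succeeds: A = 70, B = 80
transfer(conn, "A", "B", 1000)  # fails: balances left unchanged
```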
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
All users should be able to access same data
A user's view is immune to changes made in other views
Users should not need to know physical database storage details
DBA should be able to change database storage structures without affecting the users views
Internal structure of database should be unaffected by changes to physical aspects of storage
DBA should be able to change conceptual structure of database without affecting all users
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
The above three levels are explained in detail below -
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database. This view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part that
is relevant to the user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
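The first two sublanguages can be illustrated with Python's sqlite3 module (SQLite does not implement DCL statements such as GRANT/REVOKE, so that part appears only as a comment):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and declare a database object.
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object.
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
rows = conn.execute("SELECT name FROM student").fetchall()
print(rows)  # [('Asha',)]

# DCL (not supported by SQLite; syntax as used in server DBMSs):
#   GRANT SELECT ON student TO clerk;
```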
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level, the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus, the objectives of the three-level architecture proposal for DBMS are suitably explained
above.
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2 DML Compiler and Query Optimizer - The DML commands, such as insert, update, delete, and
retrieve, from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized in the best way to execute the query by
the query optimizer and then sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
Converts operations in users' queries, coming from the application programs or the combination of
DML compiler and query optimizer (known as the query processor), from the user's logical view
to the physical file system.
Controls DBMS information access that is stored on disk.
Handles buffers in main memory.
Enforces constraints to maintain the consistency and integrity of the data.
Synchronizes the simultaneous operations performed by concurrent users.
Controls the backup and recovery operations.
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1 Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2 Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3 Constraints on data, i.e. the range of values permitted.
4 Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
5 Access authorization - the description of database users, their responsibilities,
and their access rights.
6 Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation,
and accuracy; it may be used as an important part of the DBMS.
Importance of the Data Dictionary -
The Data Dictionary is necessary in databases for the following reasons:
It improves the control of the DBA over the information system and the users'
understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts high-level queries into low-level file access
commands, known as compiled DML.
7 End Users - The users of the database system can be classified in the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve users: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naive users are those to whom the type and range of responses is always indicated. Thus, even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users There are users who may communicate with the database directly via an online terminal or indirectly via a user interface and application program These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program The more sophisticated of these users may also use a data manipulation language to manipulate the database directly On-line users can also be naive users requiring help such as menus
iii) Application programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language, such as Assembler, C, COBOL, FORTRAN, Pascal, or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator Centralized control of the database is exerted by a person or group of persons under the supervision of a high level administrator This person or group is referred to as the database administrator (DBA) They are users who are the most familiar with the database and are responsible for creating modifying and maintaining its three levels
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications and is responsible for the definition and implementation of the internal level, including the storage structures and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It also implies separation of physical storage from the use of the data by an application program, i.e. program/data independence: the user, programmer or application specialist need not know the details of how the data are stored; such details are transparent to the user. Changes can be made to the data without affecting other components of the system, e.g. changing the format of data items (real to integer arithmetic), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling data redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updating the same data in different files
• Time wasted in entering the same data again and again
• Needless use of computer resources
• Difficulty in combining information
2. Elimination of inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file; otherwise the data become inconsistent. This duplication of data across multiple files therefore needs to be removed to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Use of a DBMS should also allow users who do not know programming to interact with the data more easily, unlike a file processing system, where a programmer may need to write new programs to meet every new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of the data stored in any system, such changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an ad hoc, temporary manner. Often different systems of an organization access different components of the operational data, and in such an environment enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to which parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8. The organization's requirements can be identified - All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, as the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's overall requirements and to balance the needs of the competing units, so it may become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for similar services using conventional systems, since the productivity of programmers can be higher when using the non-procedural languages developed for DBMSs than when using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, files are more likely to be designed as the needs of particular applications demand, and the overall view is often not considered. Building an overall view of an organization's data is usually cost-effective in the long term.
11. Backup and recovery are provided - Centralizing a database provides schemes such as backup and recovery from failures, including disk crashes, power failures and software errors, which help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods involved are very complex.
QUE2- EITHER
(A) Explain the E-R model with a suitable example.
Ans: The E-R model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Modelling is an iterative, team-oriented process in which all business managers (or their designates) are involved, and the result should be validated with a "bottom-up" approach. The model has three primary components: entities, relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified An entity is an abstraction from the complexities of some
domain When we speak of an entity we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world An entity may be a physical object such as a house or a car an event such as a house sale or a car service or a concept such as a customer transaction or order
An entity-type is a category An entity strictly speaking is an instance of a given entity-type There are
usually many instances of an entity-type Because the term entity-type is somewhat cumbersome most
people tend to use the term entity as a synonym for this term
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name, address, etc.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer; a supervises relationship between an employee and a department; a performs relationship between an artist and a song; a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, middle_name, last_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
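The Customer entity above can be mapped to relational tables; the following sqlite3 sketch is illustrative (the table and column names follow the attributes listed, and the separate phone table is an assumption for the case where phone_number is treated as a multivalued attribute):

```python
import sqlite3

# Illustrative mapping of the Customer entity: composite attributes (name,
# address, street) are flattened into columns; a multivalued attribute such
# as several phone numbers goes into its own table keyed by customer_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id      INTEGER PRIMARY KEY,
    first_name       TEXT,
    middle_name      TEXT,
    last_name        TEXT,
    date_of_birth    TEXT,
    city             TEXT,
    state            TEXT,
    zip_code         TEXT,
    street_name      TEXT,
    street_number    TEXT,
    apartment_number TEXT
);
CREATE TABLE customer_phone (          -- multivalued attribute
    customer_id  INTEGER REFERENCES customer(customer_id),
    phone_number TEXT,
    PRIMARY KEY (customer_id, phone_number)
);
""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name) "
             "VALUES (1, 'Asha', 'Rao')")
conn.execute("INSERT INTO customer_phone VALUES (1, '555-0101')")
conn.execute("INSERT INTO customer_phone VALUES (1, '555-0102')")
phones = [r[0] for r in conn.execute(
    "SELECT phone_number FROM customer_phone "
    "WHERE customer_id = 1 ORDER BY phone_number")]
print(phones)  # two phone numbers for one customer
```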
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file and direct file organizations we have considered retrieval and update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
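A minimal sqlite3 sketch of the point above (table and column names are illustrative): stud_id is the primary key, while an index on stud_name acts as a secondary key under which several records can satisfy one key value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student "
             "(stud_id INTEGER PRIMARY KEY, stud_name TEXT, dept TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Ravi", "MCA"), (2, "Priya", "MBA"), (3, "Ravi", "MCA")])
# Secondary index on a non-primary-key attribute:
conn.execute("CREATE INDEX idx_stud_name ON student(stud_name)")
rows = conn.execute("SELECT stud_id FROM student "
                    "WHERE stud_name = 'Ravi' ORDER BY stud_id").fetchall()
print(rows)  # multiple records match the one secondary-key value
```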
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3 - EITHER
(A) Let R = (A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is a join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot be non-loss decomposed any further into smaller tables.
Another way of expressing this is: each join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one value you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer   vendor          item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers
The question is: what do you do if Claiborne starts to sell Jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and the vendor; to determine the vendor you must know the buyer and the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
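The decomposition above can be checked in sqlite3 (a sketch using the sample data from the table): joining the three projections on their common keys reproduces exactly the original five rows, and recording "Claiborne sells Jeans" then needs only one new Vendor-Item row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE buyer_vendor (buyer TEXT, vendor TEXT, PRIMARY KEY (buyer, vendor));
CREATE TABLE buyer_item   (buyer TEXT, item   TEXT, PRIMARY KEY (buyer, item));
CREATE TABLE vendor_item  (vendor TEXT, item  TEXT, PRIMARY KEY (vendor, item));
""")
conn.executemany("INSERT INTO buyer_vendor VALUES (?,?)",
    [("Sally", "Liz Claiborne"), ("Mary", "Liz Claiborne"),
     ("Sally", "Jordach"), ("Mary", "Jordach")])
conn.executemany("INSERT INTO buyer_item VALUES (?,?)",
    [("Sally", "Blouses"), ("Mary", "Blouses"),
     ("Sally", "Jeans"), ("Mary", "Jeans"), ("Sally", "Sneakers")])
conn.executemany("INSERT INTO vendor_item VALUES (?,?)",
    [("Liz Claiborne", "Blouses"), ("Jordach", "Jeans"), ("Jordach", "Sneakers")])
# Rejoin the three projections on their common keys:
rows = conn.execute("""
    SELECT bv.buyer, bv.vendor, bi.item
    FROM buyer_vendor bv
    JOIN vendor_item vi ON vi.vendor = bv.vendor
    JOIN buyer_item  bi ON bi.buyer = bv.buyer AND bi.item = vi.item
    ORDER BY bv.buyer, bv.vendor, bi.item
""").fetchall()
print(len(rows))  # 5: exactly the original table is recovered
```

Note that Mary-Jordach-Sneakers does not appear in the join, because Mary-Sneakers is not in Buyer-Item; the decomposition loses nothing and invents nothing for this data.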
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support both batch and online application programs.
[Figure: IMS system architecture - Applications A and B, each written in a host language plus DL/I calls; each application's PSB (PSB-A, PSB-B) contains its PCBs; the PCBs map through DBDs to the IMS control program.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD). The mapping of the physical database to storage is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE#,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called the program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the logical database and the corresponding physical database.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example
1 PCB    TYPE=DB,DBDNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End-users are supported via user-written online application programs; IMS does not provide an integrated query language.
OR
(C) Explain the following -
(i) Functional dependency
Functional dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
Main characteristics of the functional dependencies used in normalization:
• They have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency.
• They hold for all time.
• They are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
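Whether a dependency X → Y holds in sample data can be checked mechanically: group the rows by the determinant and verify that each group carries exactly one dependent value. A small sketch (the relation and attribute names are illustrative):

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in `rows`
    (a list of dicts): every determinant value must map to one and only
    one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False   # same determinant, two different dependents
    return True

employees = [
    {"emp_id": 1, "dept": "Sales", "dept_head": "Iyer"},
    {"emp_id": 2, "dept": "Sales", "dept_head": "Iyer"},
    {"emp_id": 3, "dept": "HR",    "dept_head": "Khan"},
]
print(fd_holds(employees, ["dept"], ["dept_head"]))    # True
print(fd_holds(employees, ["dept_head"], ["emp_id"]))  # False: Iyer -> 1 and 2
```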
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal-form condition that it meets, and hence indicates the degree to which it has been normalized. Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; here we pay particular attention up to 3NF.
The database designers need not normalize to the highest possible normal form. Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF thus removes the unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
• There is no multivalued dependency in the relation; or
• There are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it also considers multivalued dependencies.
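A sketch of a 4NF decomposition (the tables and data are illustrative, using the classic employee-skill-language situation): when emp ↠ skill and emp ↠ language are independent multivalued dependencies, one table would have to store every skill paired with every language; splitting into two tables removes that redundancy, and the join recovers all the combinations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp_skill    (emp TEXT, skill TEXT,    PRIMARY KEY (emp, skill));
CREATE TABLE emp_language (emp TEXT, language TEXT, PRIMARY KEY (emp, language));
""")
conn.executemany("INSERT INTO emp_skill VALUES (?,?)",
                 [("Anil", "Typing"), ("Anil", "Filing")])
conn.executemany("INSERT INTO emp_language VALUES (?,?)",
                 [("Anil", "Hindi"), ("Anil", "English")])
# The unnormalized relation is recovered as the join of the two projections:
rows = conn.execute("""
    SELECT s.emp, s.skill, l.language
    FROM emp_skill s JOIN emp_language l ON s.emp = l.emp
    ORDER BY s.skill, l.language
""").fetchall()
print(len(rows))  # 4 combinations reproduced from only 2 + 2 stored rows
```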
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found by a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could get a user's account information and provide it efficiently, together with extensive related information such as transactions and account entries.
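The "retrieve the whole object directly, no joins" point can be sketched in plain Python; this is not a real ODBMS, just the stdlib shelve module persisting a whole object graph under one key (the key and field names are illustrative):

```python
import os
import shelve
import tempfile

# An "account" object with its nested transaction list is stored and
# fetched as a unit, pointer-style, instead of being reassembled by joins.
path = os.path.join(tempfile.mkdtemp(), "bank.db")
with shelve.open(path) as db:
    db["acct:1001"] = {"owner": "Meera",
                       "transactions": [("deposit", 500), ("withdraw", 200)]}
with shelve.open(path) as db:
    acct = db["acct:1001"]  # one lookup retrieves the whole object graph
print(acct["owner"], len(acct["transactions"]))
```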
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery is more easily achieved if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls the following:
• The speed and size of your transaction log backups
• The degree to which you are at risk of losing committed transactions in the event of media failure
Models
There are three types of database recovery model available:
• Full Recovery
• Bulk-Logged Recovery
• Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. The SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
• Database restoration up to any specified point in time can be achieved after a media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
• The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
• CREATE INDEX operations are logged, so recovery from a transaction log backup that includes index creations is faster, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
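The guarantee all of these models are tuned around is that committed work survives a failure while in-flight work is undone. An illustrative sketch (sqlite3, not SQL Server; the "failure" is simulated with an exception):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()                                  # committed: durable
try:
    conn.execute("UPDATE account SET balance = balance - 999 WHERE id = 1")
    raise RuntimeError("simulated failure mid-transaction")
    conn.commit()                              # never reached
except RuntimeError:
    conn.rollback()                            # recovery undoes in-flight work
balance = conn.execute(
    "SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)  # 100: the last committed state is restored
```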
(D) Describe deadlocks in a distributed system.
Ans
Database Engine: The core service for storing, processing and securing data. It provides controlled access and rapid transaction processing to address the requirements of the most demanding data-consuming applications, and it is often used to create relational databases for online transaction processing or online analytical processing.
Data Dictionary: A reserved space within a database used to store information about the database itself. A data dictionary is a set of read-only tables and views containing information about the data used in the enterprise, ensuring that the database representation of the data follows one standard as defined in the dictionary.
Report Writer: Also referred to as the report generator, this is a program that extracts information from one or more files and presents the information in a specified format. Most report writers allow the user to select records that meet certain conditions, to display selected fields in rows and columns, and also to format the data into different charts.
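The data dictionary idea can be seen in any DBMS's catalog; as an illustrative sketch, sqlite3 exposes its catalog as the read-only sqlite_master table, which describes the database itself rather than the data in it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_no INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE INDEX idx_title ON course(title)")
# Query the catalog: metadata about tables and indexes, not the data itself.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name").fetchall()
print(catalog)
```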
OR
(c) Explain the three-level architecture proposal for DBMS. (8)
We have already seen the DBMS architectures - one-tier, two-tier and three-tier. Here we discuss the three-level DBMS architecture in detail.
DBMS Three Level Architecture Diagram
This architecture has three levels
1 External level
2 Conceptual level
3 Internal level
1 External level
It is also called the view level. The reason this level is called the "view" level is that several users can view their desired data from this level; the data is internally fetched from the database with the help of the conceptual- and internal-level mappings.
The user does not need to know the database schema details, such as the data structures or table definitions; the user is concerned only with the data, which is returned to the view level after being fetched from the database (present at the internal level).
The external level is the top level of the three-level DBMS architecture.
2 Conceptual level
It is also called logical level The whole design of the database such as relationship among data
schema of data etc are described in this level
Database constraints and security are also implemented in this level of architecture This level is
maintained by DBA (database administrator)
3 Internal level
This level is also known as physical level This level describes how the data is actually stored in
the storage devices This level is also responsible for allocating space to the data This is the
lowest level of the architecture
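A small sqlite3 sketch of the external level sitting above the stored data (the table, view and column names are illustrative): the base table belongs to the conceptual/internal side, while a view is one user's external schema that hides columns the user should not see.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee "
             "(emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES (1, 'Ravi', 50000)")
# External level: a view exposing only non-sensitive columns.
conn.execute("CREATE VIEW employee_public AS SELECT emp_id, name FROM employee")
cols = [d[0] for d in conn.execute("SELECT * FROM employee_public").description]
print(cols)  # the external view exposes only emp_id and name
```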
(d) Explain
(i) Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the characteristic of being able to modify the schema at one level of the database system without altering the schema at the next higher level.
There are two types of data independence
1 Logical Data Independence
o Logical data independence refers to the characteristic of being able to change the conceptual schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual view.
o If we make any changes in the conceptual view of the data, the user's view of the data will not be affected.
o Logical data independence occurs at the user interface level.
2 Physical Data Independence
o Physical data independence can be defined as the capacity to change the internal schema without having to change the conceptual schema.
o If we make any changes in the storage structures of the database system, the conceptual structure of the database will not be affected.
o Physical data independence is used to separate the conceptual level from the internal level.
o Physical data independence occurs at the logical interface level.
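Logical data independence can be demonstrated concretely; in this illustrative sqlite3 sketch, the external view keeps answering the same query even after the conceptual schema changes (a new column is added to the base table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stud_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Priya')")
conn.execute("CREATE VIEW student_names AS SELECT name FROM student")
before = conn.execute("SELECT name FROM student_names").fetchall()
# Conceptual-level change: the external view's definition is untouched.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after = conn.execute("SELECT name FROM student_names").fetchall()
print(before == after)  # True: the user's view is unaffected
```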
(ii) Data Integration
Ans:
Data integration involves combining data residing in different sources and providing users with a unified view of them.[1] This process becomes significant in a variety of situations, both commercial (such as when two similar companies need to merge their databases) and scientific (combining research results from different bioinformatics repositories, for example). Data integration appears with increasing frequency as the volume of data (that is, big data[2]) and the need to share existing data explode.[3] It has become the focus of extensive theoretical work, and numerous open problems remain unsolved. Data integration encourages collaboration between internal as well as external users.
Figure 1: Simple schematic for a data warehouse. The extract-transform-load (ETL) process extracts information from the source databases, transforms it, and then loads it into the data warehouse.
Figure 2: Simple schematic for a data-integration solution. A system designer constructs a mediated schema against which users can run queries. The virtual database interfaces with the source databases via wrapper code if required.
Issues with combining heterogeneous data sources, often referred to as information silos, under a single query interface have existed for some time. In the early 1980s, computer scientists began designing systems for the interoperability of heterogeneous databases.[4] The first data integration system driven by structured metadata was designed at the University of Minnesota in 1991, for the Integrated Public Use Microdata Series (IPUMS). IPUMS used a data warehousing approach, which extracts, transforms and loads data from heterogeneous sources into a single view schema, so that data from different sources become compatible.[5] By making thousands of population databases interoperable, IPUMS demonstrated the feasibility of large-scale data integration. The data warehouse approach offers a tightly coupled architecture, because the data are already physically reconciled in a single queryable repository, so it usually takes little time to resolve queries.[6]
The data warehouse approach is less feasible for data sets that are frequently updated, requiring the extract-transform-load (ETL) process to be continuously re-executed for synchronization. Difficulties also arise in constructing data warehouses when one has only a query interface to summary data sources and no access to the full data. This problem frequently emerges when integrating several commercial query services, like travel or classified-advertisement web applications.
As of 2009, the trend in data integration favored loosening the coupling between data and providing a unified query interface to access real-time data over a mediated schema (see Figure 2), which allows information to be retrieved directly from the original databases. This is consistent with the SOA approach popular in that era. This approach relies on mappings between the mediated schema and the schemas of the original sources, and on transforming a query into specialized queries to match the schemas of the original databases. Such mappings can be specified in two ways: as a mapping from entities in the mediated schema to entities in the original sources (the Global-As-View (GAV) approach), or as a mapping from entities in the original sources to the mediated schema (the Local-As-View (LAV) approach). The latter approach requires more sophisticated inferences to resolve a query on the mediated schema, but makes it easier to add new data sources to a (stable) mediated schema.
As of 2010, some of the work in data integration research concerns the semantic integration problem. This problem addresses not the structuring of the architecture of the integration, but how to resolve semantic conflicts between heterogeneous data sources. For example, if two companies merge their databases, certain concepts and definitions in their respective schemas, like "earnings", inevitably have different meanings. In one database it may mean profits in dollars (a floating-point number), while in the other it might represent the number of sales (an integer). A common strategy for the resolution of such problems involves the use of ontologies, which explicitly define schema terms and thus help to resolve semantic conflicts; this approach represents ontology-based data integration. On the other hand, the problem of combining research results from different bioinformatics repositories requires benchmarking of the similarities computed from different data sources on a single criterion, such as positive predictive value. This enables the data sources to be directly comparable, and they can be integrated even when the natures of the experiments are distinct.[7]
As of 2011, it was determined that current data modeling methods were imparting data isolation into every data architecture, in the form of islands of disparate data and information silos. This data isolation is an unintended artifact of the data modeling methodology, which results in the development of disparate data models; disparate data models, when instantiated as databases, form disparate databases. Enhanced data-model methodologies have been developed to eliminate the data isolation artifact and to promote the development of integrated data models.[8] One enhanced data modeling method recasts data models by augmenting them with structural metadata in the form of standardized data entities. As a result of recasting multiple data models, the set of recast data models will share one or more commonality relationships that relate the structural metadata now common to these data models. Commonality relationships are a peer-to-peer type of entity relationship that relates the standardized data entities of multiple data models. Multiple data models that contain the same standard data entity may participate in the same commonality relationship. When integrated data models are instantiated as databases and are properly populated from a common set of master data, then these databases are integrated.
Since 2011, data hub approaches have been of greater interest than fully structured (typically
relational) Enterprise Data Warehouses. Since 2013, data lake approaches have risen to the level
of data hubs. (See all three search terms' popularity on Google Trends.[9]) These approaches
combine unstructured or varied data into one location, but do not necessarily require an (often
complex) master relational schema to structure and define all data in the hub.
Q2
EITHER
(a) Explain E-R Model with suitable example
Ans: The E-R model is a "top-down" approach. This data model allows us to describe how data
is used in a real-world enterprise. Design is an iterative, team-oriented process: all business
managers (or their designates) should be involved, and the model should be validated with a
"bottom-up" approach. There are many notation methods; Chen's notation was the first to
become established.
The building blocks of the E-R model are its three primary components: entities, relationships,
and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an
independent existence and which can be uniquely identified. An entity is an abstraction from the
complexities of some domain. When we speak of an entity, we normally speak of some aspect of
the real world which can be distinguished from other aspects of the real world. An entity may be
a physical object such as a house or a car, an event such as a house sale or a car service, or a
concept such as a customer transaction or order. An entity-type is a category; an entity, strictly
speaking, is an instance of a given entity-type. There are usually many instances of an entity-type.
Because the term entity-type is somewhat cumbersome, most people tend to use the term
entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as
student ID, student name, address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another.
Relationships can be thought of as verbs linking two or more nouns. Examples: an owns
relationship between a company and a computer, a supervises relationship between an employee
and a department, a performs relationship between an artist and a song, a proved relationship
between a mathematician and a theorem. Relationships are represented as diamonds connected
by lines to each of the entities in the relationship. Types of relationships are as follows:
One to many   1 ------- M
Many to one   M ------- 1
Many to many  M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
(b) Given: Entity Customer with attributes customer_id (primary key), name (first_name,
last_name, middle_name), phone_number, date_of_birth,
address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
An entity relationship diagram displays the relationships of the entity sets stored in a database. In
other words, ER diagrams help you to explain the logical structure of databases. At first look, an
ER diagram looks very similar to a flowchart; however, an ER diagram includes many specialized
symbols, and their meanings make this model unique.
Sample ER Diagram
Facts about ER Diagram Model:
o ER model allows you to draw a database design
o It is an easy-to-use graphical tool for modeling data
o Widely used in database design
o It is a graphical representation of the logical structure of a database
o It helps you to identify the entities which exist in a system and the relationships
between those entities
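A minimal sketch of how the Customer entity from (b) could be mapped to a relational table (using SQLite; flattening the composite attributes into simple columns, and all column choices here, are illustrative assumptions, not part of the question paper):

```python
import sqlite3

# Sketch: mapping the Customer entity to a relational table. Composite
# attributes (name, address, street) are flattened into their simple
# components; a multivalued attribute would instead go to its own table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,  -- key attribute
        first_name       TEXT,
        middle_name      TEXT,
        last_name        TEXT,                 -- components of composite 'name'
        phone_number     TEXT,
        date_of_birth    TEXT,
        city             TEXT,
        state            TEXT,
        zip_code         TEXT,                 -- components of composite 'address'
        street_name      TEXT,
        street_number    TEXT,
        apartment_number TEXT                  -- components of composite 'street'
    )
""")
conn.execute(
    "INSERT INTO customer (customer_id, first_name, last_name) VALUES (?, ?, ?)",
    (1, "John", "Smith"),
)
row = conn.execute(
    "SELECT first_name, last_name FROM customer WHERE customer_id = 1"
).fetchone()
print(row)  # ('John', 'Smith')
```

A derived attribute such as age would not be stored at all; it would be computed from date_of_birth at query time.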
(b) Differentiate between Network and Hierarchical data model in DBMS.
Ans: Hierarchical model
1. Supports one-to-many and one-to-one relationships
2. Based on a parent-child relationship; each child has exactly one parent
3. Retrieval algorithms are complex and asymmetric
4. More data redundancy
Network model
1. Supports many-to-many relationships
2. A record can have many parents as well as many children
3. Retrieval algorithms are complex but symmetric
4. Less data redundancy than the hierarchical model
Relational model
1. Supports one-to-one, one-to-many, and many-to-many relationships
2. Based on relational data structures (tables)
3. Retrieval algorithms are simple and symmetric
4. Least data redundancy
OR
(c)Draw E-R diagram on Library Management System
Ans
(d) State advantages and disadvantages of following file organizations
(i) Index-Sequential file
Ans
Sequential File Organization
1. A sequential file is designed for efficient processing of records in sorted order on some
search key.
o Records are chained together by pointers to permit fast retrieval in search-key
order.
o Each pointer points to the next record in order.
o Records are stored physically in search-key order (or as close to this as possible).
o This minimizes the number of block accesses.
o Figure 10.15 shows an example with bname as the search key.
2. It is difficult to maintain physical sequential order as records are inserted and deleted.
o Deletion can be managed with the pointer chains.
o Insertion poses problems if there is no space where the new record should go.
o If space is available, use it; else put the new record in an overflow block.
o Adjust pointers accordingly.
o Figure 10.16 shows the previous example after an insertion.
o Problem: we now have some records out of physical sequential order.
o If there are very few records in overflow blocks, this will work well.
o If order is lost, reorganize the file.
o Reorganizations are expensive and are done when system load is low.
3. If insertions rarely occur, we could keep the file in physically sorted order and reorganize
when an insertion occurs. In this case the pointer fields are no longer required.
The Sequential File
A fixed format is used for records:
Records are the same length.
All fields are the same (order and length).
Field names and lengths are attributes of the file.
One field is the key field:
It uniquely identifies the record.
Records are stored in key sequence.
New records are placed in a log file or transaction file, and a batch update is performed to
merge the log file with the master file.
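The batch update described above can be sketched as a two-way merge of key-sorted records (a toy in-memory illustration under the assumption of (key, payload) tuples; real systems merge disk blocks, not Python lists):

```python
# Sketch of the batch update: merging a key-sorted master file with a
# key-sorted transaction log of new records into a new master.
def batch_merge(master, log):
    """Merge two key-sorted lists of (key, payload) records."""
    merged, i, j = [], 0, 0
    while i < len(master) and j < len(log):
        if master[i][0] <= log[j][0]:
            merged.append(master[i]); i += 1
        else:
            merged.append(log[j]); j += 1
    merged.extend(master[i:])  # leftover master records
    merged.extend(log[j:])     # leftover log records
    return merged

master = [(10, "Ada"), (30, "Cid")]
log = [(20, "Bob")]
print(batch_merge(master, log))  # [(10, 'Ada'), (20, 'Bob'), (30, 'Cid')]
```

Because both inputs are already sorted, one sequential pass over each file suffices, which is exactly why sequential organizations favor periodic batch updates over in-place insertion.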
(ii) Direct file
Ans: Direct Access File System (DAFS) is a network file system, similar to Network File System
(NFS) and Common Internet File System (CIFS), that allows applications to transfer data while
bypassing operating system control, buffering, and network protocol operations that can
bottleneck throughput. DAFS uses the Virtual Interface (VI) architecture as its underlying
transport mechanism. Using VI hardware, an application transfers data to and from application
buffers without using the operating system, which frees up the processor and operating system
for other processes and allows files to be accessed by servers using several different operating
systems. DAFS is designed and optimized for clustered, shared-file network environments that
are commonly used for Internet, e-commerce, and database applications. It is optimized for
high-bandwidth InfiniBand networks, and it works with any interconnection that supports VI,
including Fibre Channel and Ethernet.
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and
promote DAFS; today, more than 85 companies are part of the DAFS Collaborative.
Q3
EITHER
(a) Explain tuple relational calculus
Ans
Relational Calculus
A relational calculus query specifies what is to be retrieved rather than how to retrieve it; there
is no description of how to evaluate the query.
In first-order logic (or predicate calculus), a predicate is a truth-valued function with arguments.
When we substitute values for the arguments, the function yields an expression, called a
proposition, which can be either true or false.
If a predicate contains a variable (e.g. 'x is a member of staff'), there must be a range for x.
When we substitute some values of this range for x, the proposition may be true; for other
values it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
Tuple Relational Calculus
We are interested in finding tuples for which a predicate is true; the calculus is based on the use
of tuple variables. A tuple variable is a variable that 'ranges over' a named relation, i.e. a variable
whose only permitted values are tuples of the relation.
Specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Example: to find details of all staff earning more than $10,000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Can use two quantifiers to tell how many instances the predicate applies to:
Existential quantifier ∃ ('there exists')
Universal quantifier ∀ ('for all')
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called
free variables.
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧ (B.branchNo = S.branchNo) ∧ B.city = 'London')
This means: 'There exists a Branch tuple with the same branchNo as the branchNo of the current
Staff tuple S, and it is located in London.'
The universal quantifier is used in statements about every instance, such as:
(∀B)(B.city ≠ 'Paris')
This means: 'For all Branch tuples, the address is not in Paris.'
Can also use ~(∃B)(B.city = 'Paris'), which means 'There are no branches with an
address in Paris.'
Formulae should be unambiguous and make sense. A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
Formulae can be built up recursively from atoms:
An atom is a formula.
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction F1 ∨ F2, and
negation ~F1.
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also formulae.
Examples - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000:
{S.fName, S.lName | Staff(S) ∧ S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow:
{S | Staff(S) ∧ (∃P)(PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
To avoid this, add the restriction that all values in the result must be values in the domain
of the expression.
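Such calculus expressions map directly to declarative SQL: the tuple variable becomes the FROM clause and the predicate becomes the WHERE clause. A small sketch using SQLite with the Staff/salary example above (the sample rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT, fName TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?)",
    [("S1", "Ann", 9000), ("S2", "Ben", 12000), ("S3", "Cara", 15000)],
)
# TRC: {S | Staff(S) ∧ S.salary > 10000}
# The tuple variable S ranges over Staff; the predicate is the WHERE clause.
rows = conn.execute("SELECT * FROM Staff WHERE salary > 10000").fetchall()
print([r[0] for r in rows])  # ['S2', 'S3']
```

This is the sense in which both the calculus and SQL are declarative: neither says how the tuples are to be found, only which tuples qualify.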
Data Manipulations in SQL
SELECT, UPDATE, DELETE, INSERT statements
Basic data retrieval
Condition specification
Arithmetic and aggregate operators
SQL joins: multiple-table queries
Set manipulation:
ANY, IN, CONTAINS, ALL, NOT IN, NOT CONTAINS, EXISTS, UNION, MINUS, INTERSECT
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement; data must be entered later
using INSERT.
CREATE TABLE S ( SNO    CHAR(5),
                 SNAME  CHAR(20),
                 STATUS DECIMAL(3),
                 CITY   CHAR(15),
                 PRIMARY KEY (SNO) );
A table name and unique column names must be specified.
Columns which are defined as primary keys will never have two rows with the same key value.
A primary key may consist of more than one column (values unique in combination); this is
called a composite key.
(b) Explain Data Manipulation in SQL
Ans
A data manipulation language (DML) is a computer programming language used for adding
(inserting), deleting, and modifying (updating) data in a database. A DML is often
a sublanguage of a broader database language such as SQL, with the DML comprising some of
the operators in the language.[1] Read-only selecting of data is sometimes distinguished as being
part of a separate data query language (DQL), but it is closely related and sometimes also
considered a component of a DML; some operators may perform both selecting (reading) and
writing.
A popular data manipulation language is that of Structured Query Language (SQL), which is
used to retrieve and manipulate data in a relational database.[2] Other forms of DML are those
used by IMS/DL/I, CODASYL databases such as IDMS, and others.
In SQL, the data manipulation language comprises the SQL-data change statements,[3] which
modify stored data but not the schema or database objects. Manipulation of persistent database
objects (e.g. tables or stored procedures) via the SQL schema statements,[3] rather than the data
stored within them, is considered to be part of a separate data definition language (DDL). In SQL
these two categories are similar in their detailed syntax, data types, expressions, etc., but distinct
in their overall function.[3]
The SQL-data change statements are a subset of the SQL-data statements; the latter also contains
the SELECT query statement,[3] which strictly speaking is part of the DQL, not the DML. In
common practice, though, this distinction is not made, and SELECT is widely considered to be
part of DML,[4] so the DML consists of all SQL-data statements, not only the SQL-data
change statements. The SELECT ... INTO form combines both selection and manipulation,
and thus is strictly considered to be DML because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a
statement, which is almost always a verb. In the case of SQL, these verbs are:
SELECT ... FROM ... WHERE ... (strictly speaking DQL)
SELECT ... INTO ...
INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...
For example, the command to insert a row into table employees:
INSERT INTO employees (first_name, last_name, fname)
VALUES ('John', 'Capita', 'xcapit00');
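A runnable sketch of these DML verbs against a throwaway SQLite table (the table and column names follow the insert example above; the data is illustrative):

```python
import sqlite3

# Sketch: the four DML verbs in sequence against an in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT: add a row
conn.execute("INSERT INTO employees VALUES ('John', 'Capita', 'xcapit00')")
# UPDATE: modify existing data
conn.execute("UPDATE employees SET last_name = 'Capital' WHERE fname = 'xcapit00'")
# SELECT: read data back (strictly speaking DQL)
row = conn.execute("SELECT first_name, last_name FROM employees").fetchone()
print(row)  # ('John', 'Capital')
# DELETE: remove the row again
conn.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
print(conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0])  # 0
```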
OR
(c) Explain the following integrity rules.
(i) Entity Integrity
Ans: Integrity rules are imperative to a good database design. Most RDBMSs enforce these
rules automatically, but it is safer to make sure that the rules are already applied in the design.
There are two types of integrity mentioned in integrity rules: entity and referential. Two
additional rules that aren't necessarily included in integrity rules, but are pertinent to database
designs, are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that is unique; this
ensures that each row is uniquely identified by the primary key. One requirement for entity
integrity is that a primary key cannot have a null value. The purpose of this integrity is to give
each row a unique identity, so that foreign key values can properly reference primary key
values.
Theta Join
In a theta join, we apply a condition on the input relation(s), and only the selected rows are
used in the cross product to be merged and included in the output. In a normal cross product,
all the rows of one relation are mapped/merged with all the rows of the second relation, but
here only selected rows of a relation take part in the cross product with the second relation.
If R and S are two relations, then θ is the condition applied in the select operation on one
relation, after which only the selected rows are cross-producted with all the rows of the second
relation. For example, given two relations FACULTY and COURSE, we first apply the select
operation on the FACULTY relation to select certain specific rows; then these rows take part in
a cross product with the COURSE relation. This is the difference between a cross product and a
theta join: looking at both relations, their attributes, and the cross product carried out after the
select operation, the difference between cross product and theta join becomes clear.
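A theta join can be sketched as a Cartesian product filtered by the condition θ (a toy in-memory illustration; the FACULTY/COURSE attribute names are invented for the example):

```python
# Sketch: theta join as a filtered Cartesian product over lists of dicts.
def theta_join(r, s, theta):
    """Return combined rows of the cross product of r and s where theta holds."""
    return [{**a, **b} for a in r for b in s if theta(a, b)]

faculty = [
    {"fac_id": 1, "dept": "CS"},
    {"fac_id": 2, "dept": "Math"},
]
course = [
    {"course": "DBMS", "dept": "CS"},
    {"course": "Algebra", "dept": "Math"},
]
# Theta condition: matching department (equality makes this an equi-join,
# a special case of theta join; theta may be any comparison).
result = theta_join(faculty, course, lambda a, b: a["dept"] == b["dept"])
print([(r["fac_id"], r["course"]) for r in result])  # [(1, 'DBMS'), (2, 'Algebra')]
```

With `theta=lambda a, b: True` the same function degenerates to the plain cross product, which makes the difference between the two operations concrete.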
(ii) Referential Integrity
Ans: Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having
the foreign key (in the associated table) reference a primary key value (in the primary, or
parent, table). Because of this, we need to ensure that data on both sides of the relationship
remain intact.
So, referential integrity requires that whenever a foreign key value is used, it must reference a
valid, existing primary key in the parent table.
Example:
For example, if we delete record number 15 in a primary table, we need to be sure that there's no
foreign key in any related table with the value of 15. We should only be able to delete a primary
key if there are no associated records; otherwise, we would end up with an orphaned record.
Here, the related table contains a foreign key value that doesn't exist in the primary key field of
the primary table (i.e. the "CompanyId" field). This has resulted in an "orphaned record".
So, referential integrity will prevent users from:
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
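This enforcement can be sketched in SQLite, which checks foreign keys only after `PRAGMA foreign_keys = ON` (the table names follow the CompanyId example above; the data is illustrative):

```python
import sqlite3

# Sketch: SQLite rejecting operations that would break referential integrity.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute("CREATE TABLE company (CompanyId INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE product (
        ProductId INTEGER PRIMARY KEY,
        CompanyId INTEGER REFERENCES company(CompanyId)
    )
""")
conn.execute("INSERT INTO company VALUES (15, 'Acme')")
conn.execute("INSERT INTO product VALUES (1, 15)")  # valid parent: accepted
try:
    conn.execute("INSERT INTO product VALUES (2, 99)")  # no parent 99: rejected
except sqlite3.IntegrityError as e:
    print("insert rejected:", e)
try:
    # would orphan product 1, so it is rejected
    conn.execute("DELETE FROM company WHERE CompanyId = 15")
except sqlite3.IntegrityError as e:
    print("delete rejected:", e)
```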
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually
with no indication of an error. This could result in records being "lost" in the database, because
they're never returned in queries or reports. It could also result in strange results appearing in
reports (such as products without an associated company). Or, worse yet, it could result in
customers not receiving products they paid for. Worse still, it could affect life-and-death
situations, such as a hospital patient not receiving the correct treatment, or a disaster relief team
not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity, which is concerned with the accuracy and
consistency of all data (relationship or otherwise). Maintaining data integrity is a crucial part of
working with databases.
(d) Explain the following domains in detail with example.
Ans: Definition: The domain of a database attribute is the set of all allowable values that
attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column.
In data management and database analysis, a data domain refers to all the unique values which
a data element may contain. The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values.[1]
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type and
allowed to have one of two known code values: 'M' for male, 'F' for female, and NULL for
records where gender is unknown or not applicable (or, arguably, 'U' for unknown as a sentinel
value). The data domain for the gender column is {'M', 'F'}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value (excluding NULL). Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check
constraint or, in more complex cases, in a database trigger. For example, a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero.
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined.
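Both kinds of domain rule mentioned above, an enumerated value list and a positive-number rule, can be sketched as CHECK constraints in SQLite (the `person` table and its columns are illustrative):

```python
import sqlite3

# Sketch: enforcing data domains with CHECK constraints.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person (
        name   TEXT,
        gender TEXT CHECK (gender IN ('M', 'F')),  -- enumerated domain
        salary REAL CHECK (salary > 0)             -- positive-number domain
    )
""")
conn.execute("INSERT INTO person VALUES ('Ann', 'F', 50000)")      # accepted
try:
    conn.execute("INSERT INTO person VALUES ('Bob', 'X', 40000)")  # 'X' not in domain
except sqlite3.IntegrityError as e:
    print("rejected:", e)
try:
    conn.execute("INSERT INTO person VALUES ('Cid', 'M', -5)")     # violates salary > 0
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

Note that a NULL gender would still be accepted by the CHECK, matching the text's treatment of NULL as "unknown or not applicable".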
(ii) Degree and cardinality
The degree of relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) to the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
(The last is conventionally written M:N rather than M:M, since the two sides may differ.)
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A
one-to-one relationship rarely exists in practice, but it can; if it does, you may consider
combining the two entities into one.
For example, an employee is allocated a company car which can only be driven by that
employee. Therefore, there is a one-to-one relationship between employee and company car.
One-to-many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For
example, taking the employee and department entities shown on the previous page, an employee
works in one department, but a department has many employees.
Therefore, there is a one-to-many relationship between department and employee.
Many-to-many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would prevent any such relationships, but the
definition is included here for completeness. As with one-to-one relationships, many-to-many
relationships rarely exist; normally they occur because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a
team of many employees.
Therefore, there is a many-to-many relationship between employee and project.
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL) the group responsible for standardization of the
programming language COBOL The DBTG final report appeared in Apri1971 it
introduced a new distinct and self-contained language The DBTG is intended to meet the
requirements of many distinct programming languages not just COBOL the user in a
DBTG system is considered to be an ordinary application programmer and the language
therefore is not biased toward any single specific programming language
(b) It is based on network model In addition to proposing a formal notation for networks (the
Data Definition Language or DDL) the DBTG has proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of conceptual scheme that
was itself defined using the Data Definition Language It also proposed a Data
Manipulation Language (DML) suitable for writing applications programs that
manipulate the conceptual scheme or a view
Architecture of DBTG Model
The architecture of a DBTG system is illustrated in the figure. The architecture of the DBTG
model can be divided into three different levels, like the architecture of a database system:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
Schema
In DBTG, the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data-items they
contain, and the sets into which they are grouped. (Here, logical record types are referred
to as record types; the fields in a logical record format are called data-items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data-items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data-item, and set are excluded.
In the DBTG model, the users are application programmers writing in an ordinary
programming language, such as COBOL, that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data-item)
defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization is the process of decomposing unsatisfactory "bad" relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties. Normalization in industry pays particular attention to
normalization up to 3NF, BCNF, or 4NF; we will pay particular attention up to 3NF.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table, transform data from an information source (e.g. a form) into
table format with columns and rows.
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
If a table of data meets the definition of a relation, it is in first normal form:
Every relation has a unique name.
Every attribute value is atomic (single-valued).
Every row is unique.
Attributes in tables have unique names.
The order of the columns is irrelevant.
The order of the rows is irrelevant.
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove each repeating group either by
entering appropriate data into the empty columns of rows containing repeating
data ('flattening' the table),
or by
placing the repeating data, along with a copy of the original key attribute(s), into a
separate relation.
Second Normal Form (2NF)
Based on the concept of full functional dependency:
If A and B are attributes of a relation, B is fully dependent on A if B is functionally
dependent on A but not on any proper subset of A.
2NF: a relation that is in 1NF and in which every non-primary-key attribute is fully
functionally dependent on the primary key.
Equivalently: 1NF and no partial functional dependencies.
A partial functional dependency exists when one or more non-key attributes are functionally
dependent on part of the primary key. Every non-key attribute must be defined by the entire
key, not just by part of the key. If a relation has a single attribute as its key, then it is
automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies on the primary key exist, remove them by placing them in a new
relation along with a copy of their determinant.
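The 1NF-to-2NF step can be sketched on a hypothetical OrderLine relation (not from the question paper) with key (order_id, product_id), where product_name depends only on product_id, a partial dependency:

```python
import sqlite3

# Hypothetical 1NF relation with key (order_id, product_id); product_name
# depends only on product_id — a partial dependency on part of the key.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_line_1nf (
        order_id INTEGER, product_id INTEGER, product_name TEXT, qty INTEGER,
        PRIMARY KEY (order_id, product_id)
    )
""")
conn.executemany("INSERT INTO order_line_1nf VALUES (?, ?, ?, ?)", [
    (1, 10, "Pen", 3), (1, 11, "Pad", 1), (2, 10, "Pen", 5),
])

# 2NF decomposition: move product_name into a new relation together with
# a copy of its determinant, product_id.
conn.execute("""
    CREATE TABLE product AS
    SELECT DISTINCT product_id, product_name FROM order_line_1nf
""")
conn.execute("""
    CREATE TABLE order_line AS
    SELECT order_id, product_id, qty FROM order_line_1nf
""")
# 'Pen' is now stored once instead of once per order line.
print(conn.execute("SELECT COUNT(*) FROM product").fetchone()[0])  # 2
```

Joining `order_line` back to `product` on product_id recovers the original relation, so the decomposition is lossless while removing the redundancy.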
Third Normal Form (3NF)
2NF and no transitive dependencies.
A transitive dependency is a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency:
If A, B, and C are attributes of a relation such that A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C).
3NF: a relation that is in 1NF and 2NF, and in which no non-primary-key
attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with suitable example.
Ans: As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies:
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form (4NF)
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is said to be in
4NF if and only if it is in BCNF and all its multi-valued dependencies are functional
dependencies. 4NF removes unwanted data structures: multi-valued dependencies.
One of these conditions must hold in order for a relation to be in fourth normal form:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
uses multivalued dependencies.
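A common textbook example of a multivalued dependency (assumed here for illustration, not taken from the source): in CTB(course, teacher, book), the teachers and books of a course are independent of each other, so course ->> teacher and course ->> book hold, and the relation equals the natural join of its two projections:

```python
# Sketch: a multivalued dependency and its 4NF decomposition.
# The 4NF design stores the two independent facts separately...
teachers = {"DBMS": ["Rao", "Sen"]}       # course ->> teacher
books = {"DBMS": ["Navathe", "Ullman"]}   # course ->> book

ct = [(c, t) for c, ts in teachers.items() for t in ts]  # CT projection
cb = [(c, b) for c, bs in books.items() for b in bs]     # CB projection

# ...and the original CTB relation is recovered losslessly by a natural
# join on course: every teacher is paired with every book of the course.
ctb = [(c, t, b) for (c, t) in ct for (c2, b) in cb if c == c2]
print(len(ctb))  # 4
```

The unnormalized CTB table must store all four combinations; adding a third book would force two new rows (one per teacher), which is exactly the redundancy 4NF removes.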
(d) What are inference axioms? Explain their significance in Relational
Database Design.
Ans: Inference Axioms (A-axioms, or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy
certain other FDs:
F1 Reflexivity: X → X
F2 Augmentation: if Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: if X → Y and X → Z, then X → YZ
F4 Projectivity: if X → YZ, then X → Y
F5 Transitivity: if X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: if X → Y and YZ → W, then XZ → W
Examples of the use of inference axioms:
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show: Street Zip → Street Zip City.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity of (2) and (4)
[From Maier]
2. Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F:
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
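Derivations like AB → GH can also be checked mechanically with the standard attribute-closure algorithm: AB → GH holds iff GH lies in the closure of AB under F. A sketch in Python using the FD set from the Maier example:

```python
# Sketch: attribute closure under a set of functional dependencies.
def closure(attrs, fds):
    """fds: list of (lhs, rhs) frozenset pairs; returns the closure of attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is in the closure, pull in the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# F = {AB -> E, AG -> J, BE -> I, E -> G, GI -> H} from the example above.
F = [(frozenset(l), frozenset(r)) for l, r in
     [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]]
c = closure({"A", "B"}, F)
print(sorted(c))       # ['A', 'B', 'E', 'G', 'H', 'I', 'J']
print({"G", "H"} <= c)  # True: AB -> GH holds
```

Note that J also lands in the closure (via AG → J once G is derived), which the step-by-step proof did not need but the algorithm finds automatically.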
Significance in Relational Database Design: the inference axioms let a designer derive all
functional dependencies implied by a given set, which underpins the relational model. A
relational database is a database structure, commonly used in GIS, in which data is stored in
two-dimensional tables and multiple relationships between data elements can be defined and
established in an ad-hoc manner. A Relational Database Management System (RDBMS) is a
database system made up of files with data elements in a two-dimensional array (rows and
columns); it has the capability to recombine data elements to form different relations, resulting
in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables:
• Tables are manipulated a set at a time, rather than a record at a time.
• SQL is used to manipulate relational databases. The model was proposed by Dr. Codd in 1970.
• It is the basis for the relational database management system (RDBMS).
• The relational model contains the following components:
a collection of objects or relations, and a set of operations to act on the relations.
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be handled in two ways: one is to take measures that
prevent deadlocks from happening, and the other is to provide a way to break a deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring the transaction gains access to everything it needs or
nothing at all. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a fixed order, which prevents circular waiting. Once a
deadlock does occur, the DBMS must have a method for detecting it; to resolve it, the DBMS
must select a transaction to cancel and revert the entire transaction until the resources required
become available, allowing one transaction to complete while the other is reprocessed at a
later time.
Explain the meaning of the expression ACID transaction.
ACID stands for Atomicity, Consistency, Isolation, Durability. Any transaction
should be atomic: it should either complete fully or not at all; there should be
nothing like a semi-complete transaction. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, the transactions
should be scheduled in such a fashion that they remain in isolation from one another. Durability
means that once a transaction commits, its effects will persist even if there are system
failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the
process of being changed. Their purpose is to ensure consistency throughout the database. For example, if I
am changing a row which affects the calculations or outputs of several other rows, then
all rows that are affected, or possibly affected, by a change in the row I am working on will
be locked from changes until my change is complete. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let us
say I am processing a change to the tax rate in my state; my store clerk should not be able
to read the total cost of a blue shirt, because the total cost row is affected by any change in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but has not yet been committed is the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
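The resource-ordering scheme described above can be sketched in a few lines: if every transaction acquires its locks in one fixed global order (here, by resource name), a circular wait can never form. This is a toy illustration, not any DBMS's actual lock manager; the resource names are made up.

```python
import threading

# Two shared resources guarded by locks; names are illustrative.
locks = {"accounts": threading.Lock(), "orders": threading.Lock()}

def transaction(needed, work):
    """Acquire all needed locks in the global (sorted-name) order,
    do the work, then release in reverse order."""
    for name in sorted(needed):
        locks[name].acquire()
    try:
        work()
    finally:
        for name in sorted(needed, reverse=True):
            locks[name].release()

done = []
# T1 asks for (accounts, orders); T2 asks for (orders, accounts) - without
# the ordering rule this interleaving is the classic deadlock scenario.
t1 = threading.Thread(target=transaction, args=(["accounts", "orders"], lambda: done.append("T1")))
t2 = threading.Thread(target=transaction, args=(["orders", "accounts"], lambda: done.append("T2")))
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(done))  # ['T1', 'T2'] - both transactions finish; no deadlock
```

Because both threads sort their lock requests, they compete for "accounts" first, and whichever loses simply waits rather than holding a lock the other needs.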
(b) Explain concurrency control and database recovery in detail.
Ans: In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure the atomicity, isolation and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
• Lock-based protocols
• Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
• Binary locks – A lock on a data item can be in two states: it is either locked or
unlocked.
• Shared/exclusive locks – This type of locking mechanism differentiates the locks based on
their use. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock; allowing more than one transaction to write to the same data item
would lead the database into an inconsistent state. Read locks are shared, because no data
value is being changed.
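The shared/exclusive rule reduces to a tiny compatibility check: a shared request is compatible only with other shared locks, and an exclusive request is compatible with nothing. A minimal sketch (not any particular DBMS's implementation):

```python
def can_grant(requested, held):
    """Return True if a lock request is compatible with the locks already
    held on the data item. 'S' = shared (read), 'X' = exclusive (write)."""
    if requested == "S":
        # Readers may share: compatible as long as no writer holds the item.
        return all(h == "S" for h in held)
    # A writer needs the item to be completely unlocked.
    return len(held) == 0

print(can_grant("S", ["S", "S"]))  # True: many readers coexist
print(can_grant("X", ["S"]))       # False: writer waits for readers
print(can_grant("X", []))          # True: writer gets an idle item
```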
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks. Before initiating an execution, the transaction requests the system for all the locks it
needs beforehand. If all the locks are granted, the transaction executes and releases all the locks
when all its operations are over. If all the locks are not granted, the transaction rolls back and
waits until all the locks are granted.
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first
part, when the transaction starts executing, it seeks permission for the locks it requires. The
second part is where the transaction acquires all the locks. As soon as the transaction releases its
first lock, the third phase starts; in this phase the transaction cannot demand any new locks, it
only releases the acquired locks.
Two-phase locking thus has two phases: a growing phase, in which all the locks are being acquired by
the transaction, and a shrinking phase, in which the locks held by the transaction are
being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as that of 2PL. After acquiring all the locks in the first phase, the
transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a
lock after using it; it holds all the locks until the commit point and releases them all
at one time.
Strict-2PL does not suffer from cascading aborts as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either the system time or a logical counter as a timestamp.
Lock-based protocols manage the order between conflicting pairs among transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 0002 would be older than all other
transactions that come after it; for example, any transaction y entering the system at 0004 is
two seconds younger, and priority is given to the older one.
In addition, every data item is given the latest read-timestamp and write-timestamp. This lets the system
know when the last 'read' and 'write' operations were performed on the data item.
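The per-item read/write timestamps drive the basic timestamp-ordering rule: an operation that arrives "too late" relative to a younger transaction's access is rejected and its transaction aborted. A minimal sketch of that rule, assuming one read_ts/write_ts pair per data item as described above:

```python
class Item:
    def __init__(self):
        self.read_ts = 0   # timestamp of the youngest reader so far
        self.write_ts = 0  # timestamp of the youngest writer so far

def read(item, ts):
    """Reject a read that arrives after a younger transaction's write."""
    if ts < item.write_ts:
        return "abort"
    item.read_ts = max(item.read_ts, ts)
    return "ok"

def write(item, ts):
    """Reject a write that a younger transaction has already read past
    or overwritten."""
    if ts < item.read_ts or ts < item.write_ts:
        return "abort"
    item.write_ts = ts
    return "ok"

x = Item()
print(read(x, 5))   # ok: T5 reads x
print(write(x, 3))  # abort: older T3 arrives too late to write
print(write(x, 7))  # ok: younger T7 may write
```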
OR
(b) Explain database security mechanisms.
Ans: Database security covers and enforces security on all aspects and components of databases. This
includes:
• Data stored in the database
• The database server
• The database management system (DBMS)
• Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator
and/or other information security professionals.
Some of the ways database security is analyzed and implemented include:
• Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls.
• Load/stress testing and capacity testing of a database to ensure it does not crash under a
distributed denial of service (DDoS) attack or user overload.
• Physical security of the database server and backup equipment against theft and natural
disasters.
• Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them.
(d) Explain knowledge based database systems in detail.
Ans:
The term knowledge base was coined to distinguish this form of knowledge store from the
more common and widely used term database. At the time (the 1970s), virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database. At this point in the history of Information Technology, the distinction
between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
• Flat data: Data was usually represented in a tabular format, with strings or numbers in each
field.
• Multiple users: A conventional database needed to support more than one user or system
logged into the same data at the same time.
• Transactions: An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-
called ACID properties: Atomicity, Consistency, Isolation and Durability.
• Large, long-lived data: A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in the artificial
intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze and reuse
knowledge. For example, see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge base compared to a conventional
database. The knowledge base needed to know facts about the world, for example to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store thousands of rows that
represented information about specific humans. Representing that all humans are mortal, and
being able to reason about any given human that they are mortal, is the work of a knowledge
base. Representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other
customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments,
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple, distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and object-oriented
communities, object-oriented databases such as Versant emerged. These were systems designed
from the ground up to have support for object-oriented capabilities, but also to support standard
database services as well. On the other hand, the large database vendors, such as Oracle, added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge base was the Internet. With the rise of the Internet,
documents, hypertext and multimedia support were now critical for any corporate database. It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents. This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge
management vendors, such as Lotus Notes. Knowledge management actually predated the
Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term "knowledge base" to describe their repositories, but the
meaning had a subtle difference. In the case of previous knowledge-based systems, the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world. With knowledge management products, the knowledge was
primarily meant for humans, for example to serve as a repository of manuals, procedures,
policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 – 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data: there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-
user system. It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from the Ancient Greek átomos, "undividable") is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails.
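The transfer example can be demonstrated with the standard-library sqlite3 module, whose connection context manager commits on success and rolls back on error. The table and account names here are illustrative; the point is that a failed debit also undoes the already-applied credit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# CHECK enforces that no account may go negative.
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(amount):
    try:
        with conn:  # one atomic transaction: commit on success, rollback on error
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,))
    except sqlite3.IntegrityError:
        pass  # CHECK failed: the credit to B is rolled back as well

transfer(500)  # would overdraw A, so neither update survives
transfer(30)   # a valid transfer commits both updates
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'A': 70, 'B': 80} - money neither lost nor created
```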
(B) Give the three level architecture proposal for DBMS.
Ans: Objectives of the three level architecture proposal for a DBMS:
• All users should be able to access the same data.
• A user's view is immune to changes made in other views.
• Users should not need to know physical database storage details.
• The DBA should be able to change database storage structures without affecting the users' views.
• The internal structure of the database should be unaffected by changes to the physical aspects of storage.
• The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database; this view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part that
is relevant to the user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
• Data Definition Language (DDL)
• Data Manipulation Language (DML)
• Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
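The DDL/DML distinction can be seen concretely with the standard-library sqlite3 module. This is only an illustration with made-up table names; note that SQLite itself has no DCL (GRANT/REVOKE), so that sublanguage is only mentioned in the comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and declare the database object.
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object.
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("UPDATE student SET name = 'Asha K' WHERE student_id = 1")
rows = conn.execute("SELECT name FROM student").fetchall()
print(rows)  # [('Asha K',)]

# DCL (e.g. GRANT SELECT ON student TO clerk) would control access rights,
# but SQLite does not implement it; server DBMSs such as PostgreSQL do.
```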
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three level architecture proposal for a DBMS are explained
above.
(C) Describe the structure of DBMS
Ans: A DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update and
retrieve) on the database. The components of the DBMS perform these requested operations on the
database and provide the necessary data to the users.
Fig: Structure of Database Management System
Components of DBMS:
• DDL Compiler
• Data Manager
• File Manager
• Disk Manager
• Query Processor
• Telecommunication System
• Data Files
• Data Dictionary
• Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified
in the DDL. It stores metadata information such as the names of the files, the data items, the storage
details of each file, mapping information, constraints, etc.
2. DML Compiler and Query Optimizer - The DML commands, such as insert, update, delete and
retrieve, from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized into the best way to execute the query by
the query optimizer and then sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
• Converting operations in users' queries, coming from the application programs or from the combination of
the DML compiler and query optimizer (known as the Query Processor), from the user's logical view
to the physical file system.
• Controlling DBMS information access that is stored on disk.
• Handling buffers in main memory.
• Enforcing constraints to maintain the consistency and integrity of the data.
• Synchronizing the simultaneous operations performed by concurrent users.
• Controlling backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
• Data - the names of the tables, the names of the attributes of each table, the lengths of attributes, and the number of rows in each table.
• Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
• Constraints on data, i.e. the range of values permitted.
• Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
• Access authorization - a description of database users, their responsibilities
and their access rights.
• Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation
and accuracy. It may be used as an important part of the DBMS.
Importance of Data Dictionary - The data dictionary is necessary in databases due to the following reasons:
• It improves the control of the DBA over the information system and the users'
understanding of the use of the system.
• It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve users: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database: in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of response is always indicated. Thus, a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language, such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of physical storage from use of the
data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored; such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g. changing the format of data items (real to integer), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one device to
another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files.
• Wastage of storage space, since duplicated data is stored.
• Errors generated due to duplication of the same data in different files.
• Time wasted in entering the same data again and again.
• Computer resources being needlessly used.
• Difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data. We therefore need to remove this duplication of
data in multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In
conventional systems, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to anticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, the use of a DBMS
should allow users who do not know programming to interact with the data more easily, unlike a
file processing system where the programmer may need to write new programs to meet every
new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the
data in the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is
centralized and would be used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an
ad-hoc/temporary manner. Often different systems of an organization would access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database. Different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8. Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers the work of its unit as the most
important, and therefore considers its needs as the most important. Once a database has been
set up with centralized control, it will be necessary to identify the organization's requirements and
to balance the needs of the competing units. So it may become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher when using the non-procedural languages that
have been developed with DBMSs than when using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand. The overall view is often not considered. Building an overall view of the
organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes such as
recovery and backups from failures, including disk crashes, power failures and software errors,
which may help the database to recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain the ER model with a suitable example.
Ans: It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process, with all business managers (or designates)
involved, and should be validated with a "bottom-up" approach. It has three primary components: entities,
relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object, such as a house or a car; an event, such as a house sale or a car service; or a concept, such as a customer transaction or order.
An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type; there are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name, address etc.
Attributes are of various types:
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index-sequential files and direct files we have considered the retrieval and update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
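As a concrete sketch, a secondary key can be implemented as an index on a non-key attribute. The example below uses Python's built-in sqlite3 module; the table and column names are invented for illustration, not taken from any particular system in the syllabus.

```python
import sqlite3

# In-memory database: roll_no is the primary key of the student table
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, stud_name TEXT)")
con.execute("CREATE INDEX idx_stud_name ON student (stud_name)")  # secondary key
con.executemany("INSERT INTO student VALUES (?, ?)",
                [(1, "Asha"), (2, "Ravi"), (3, "Asha")])

# Secondary key retrieval: several records can satisfy the given key value
rows = con.execute("SELECT roll_no FROM student WHERE stud_name = ?",
                   ("Asha",)).fetchall()
print(rows)  # → [(1,), (3,)]
```

Unlike a primary-key lookup, which returns at most one record, the secondary-key query returns the whole set of matching records.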
(D) Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3 - EITHER
(A) Let R = (A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries -
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot be further losslessly decomposed into any number of smaller tables. Another way of expressing this is that every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependency - if a relation cannot be losslessly decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one, you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer   vendor          item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers
The question is: what do you do if Claiborne starts to sell Jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
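The decomposition can be checked mechanically. The short Python sketch below (sample rows from the table above; variable names invented) projects Buying onto the three pairwise tables and natural-joins them back, confirming that the join reproduces the original relation.

```python
# Buying(buyer, vendor, item), using the sample data from the text
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three pairwise projections: Buyer-Vendor, Buyer-Item, Vendor-Item
bv = {(b, v) for b, v, i in buying}
bi = {(b, i) for b, v, i in buying}
vi = {(v, i) for b, v, i in buying}

# Natural join of the three projections on their common attributes
joined = {(b, v, i)
          for b, v in bv
          for b2, i in bi if b2 == b
          if (v, i) in vi}

# The join dependency holds: the join reproduces the original table
print(joined == buying)  # → True
```

When the join dependency holds, the three smaller tables carry the same information with no pairwise cyclical dependency, so recording "Claiborne sells Jeans" needs only one new Vendor-Item row.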
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support both batch and online application programs.
[Architecture diagram: Application A and Application B are each written in a host language with DL/I calls. Each application has its own PSB (PSB-A, PSB-B) consisting of PCBs; the PCBs map through the DBDs to the physical databases managed by the IMS control program.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD); the mapping of the physical database to storage is also specified in the DBD. The set of DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled and the object form is stored in a system library, from which it may be extracted when required by the IMS control program.
All DBD names in IMS are limited to a maximum length of eight characters.
Example:
Example
DBD   NAME=EDUCPDBD
SEGM  NAME=COURSE,BYTES=256
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
FIELD NAME=DESCRIPN,BYTES=220,START=37
SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
FIELD NAME=LOCATION,BYTES=12,START=7
FIELD NAME=FORMAT,BYTES=2,START=19
SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called a program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example
PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
SENSEG NAME=COURSE,PROCOPT=G
SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are supported via user-written on-line application programs; IMS does not provide an integrated query language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
Main characteristics of the functional dependencies used in normalization:
they have a 1:1 relationship between the attribute(s) on the left-hand and right-hand sides of the dependency;
they hold for all time;
they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
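A minimal sketch of how such a dependency can be tested on sample data (the relation and attribute names below are invented, not from the question paper):

```python
def holds(rows, lhs, rhs):
    """Return True if every pair of rows agreeing on lhs also agrees on rhs,
    i.e. the functional dependency lhs -> rhs holds in this instance."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:  # same determinant, different dependent
            return False
    return True

staff = [
    {"staffNo": "S1", "branchNo": "B1", "city": "London"},
    {"staffNo": "S2", "branchNo": "B1", "city": "London"},
    {"staffNo": "S3", "branchNo": "B2", "city": "Paris"},
]

print(holds(staff, ["branchNo"], ["city"]))   # branchNo -> city: True
print(holds(staff, ["city"], ["staffNo"]))    # city -> staffNo: False
```

Note that a check over one instance can only refute a dependency; whether it holds "for all time" is a statement about the schema's semantics.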
(D) Explain 4NF with examples.
Ans: Normalization: The process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal-form condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF. The database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, each corresponding to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and only if it is in BCNF and its multivalued dependencies are functional dependencies. 4NF removes an unwanted kind of data structure: multivalued dependencies.
For a relation to be in fourth normal form, either:
there is no multivalued dependency in the relation; or
there are multivalued dependencies, but the dependent attributes are dependent between themselves.
One of these conditions must hold, and the relation must also be in BCNF. Fourth normal form differs from BCNF only in that it takes multivalued dependencies into account.
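A multivalued dependency X ↠ Y can be tested on sample data by checking that, within each X-group, the relation equals the cross product of its Y-values and its remaining-attribute values. The Python sketch below uses an invented course/teacher/text relation:

```python
from itertools import product

def mvd_holds(rows, attrs, x, y):
    """Test X ->> Y: for each X-group, the (Y, Z) pairs present must be the
    full cross product of the group's Y-values and Z-values (Z = the rest)."""
    z = [a for a in attrs if a not in x + y]
    groups = {}
    for row in rows:
        key = tuple(row[a] for a in x)
        groups.setdefault(key, set()).add(
            (tuple(row[a] for a in y), tuple(row[a] for a in z)))
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        if pairs != set(product(ys, zs)):
            return False
    return True

course = [  # teachers and texts of a course are independent of each other
    {"course": "DB", "teacher": "Ann", "text": "Korth"},
    {"course": "DB", "teacher": "Ann", "text": "Date"},
    {"course": "DB", "teacher": "Bob", "text": "Korth"},
    {"course": "DB", "teacher": "Bob", "text": "Date"},
]
print(mvd_holds(course, ["course", "teacher", "text"],
                ["course"], ["teacher"]))  # → True
```

Deleting any one of the four rows breaks the cross-product property, so the MVD would no longer hold; this is exactly the redundancy that decomposing to 4NF removes.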
Q5
Either
(A) What are object oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time, even to the best configured systems; this is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls:
the speed and size of your transaction log backups;
the degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery models available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. The SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after a media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
CREATE INDEX operations are logged, so recovery from a transaction log backup that includes index creations is faster, because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans:
This architecture has three levels:
1. External level
2. Conceptual level
3. Internal level
1. External level
It is also called the view level. This level is called "view" because several users can view their desired data from this level; the data is internally fetched from the database with the help of the conceptual- and internal-level mappings.
The user does not need to know database schema details such as data structures, table definitions etc.; the user is only concerned with the data, which is what is returned to the view level after it has been fetched from the database (present at the internal level).
The external level is the "top level" of the Three Level DBMS Architecture.
2. Conceptual level
It is also called the logical level. The whole design of the database, such as the relationships among data, the schema of the data etc., is described at this level.
Database constraints and security are also implemented at this level of the architecture, which is maintained by the DBA (database administrator).
3. Internal level
This level is also known as the physical level. It describes how the data is actually stored on the storage devices, and it is also responsible for allocating space to the data. This is the lowest level of the architecture.
(d) Explain
(i) Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the characteristic of being able to modify the schema at one level of the database system without altering the schema at the next higher level.
There are two types of data independence:
1. Logical Data Independence
o Logical data independence refers to the characteristic of being able to change the conceptual schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual view.
o If we make any changes in the conceptual view of the data, the user view of the data is not affected.
o Logical data independence occurs at the user interface level.
2. Physical Data Independence
o Physical data independence can be defined as the capacity to change the internal schema without having to change the conceptual schema.
o If we make any changes in the storage size of the database system server, the conceptual structure of the database is not affected.
o Physical data independence is used to separate the conceptual level from the internal level.
o Physical data independence occurs at the logical interface level.
(ii) Data Integration
Ans:
Data integration involves combining data residing in different sources and providing users with
a unified view of them[1] This process becomes significant in a variety of situations which
include both commercial (such as when two similar companies need to merge their databases)
and scientific (combining research results from different bioinformatics repositories for
example) domains Data integration appears with increasing frequency as the volume (that is big
data[2]) and the need to share existing data explodes[3] It has become the focus of extensive
theoretical work and numerous open problems remain unsolved Data integration encourages
collaboration between internal as well as external users
Figure 1: Simple schematic for a data warehouse. The extract, transform, load (ETL) process extracts information from the source databases, transforms it, and then loads it into the data warehouse.
Figure 2: Simple schematic for a data-integration solution. A system designer constructs a mediated schema against which users can run queries. The virtual database interfaces with the source databases via wrapper code if required.
Issues with combining heterogeneous data sources, often referred to as information silos, under a single query interface have existed for some time. In the early 1980s, computer scientists began designing systems for the interoperability of heterogeneous databases.[4] The first data integration system driven by structured metadata was designed at the University of Minnesota in 1991 for
system driven by structured metadata was designed at the University of Minnesota in 1991 for
the Integrated Public Use Microdata Series (IPUMS) IPUMS used a data warehousing approach
which extracts transforms and loads data from heterogeneous sources into a single
view schema so data from different sources become compatible[5] By making thousands of
population databases interoperable IPUMS demonstrated the feasibility of large-scale data
integration The data warehouse approach offers a tightly coupled architecture because the data
are already physically reconciled in a single queryable repository so it usually takes little time to
resolve queries[6]
The data warehouse approach is less feasible for data sets that are frequently updated, requiring the extract, transform, load (ETL) process to be continuously re-executed for synchronization.
Difficulties also arise in constructing data warehouses when one has only a query interface to
summary data sources and no access to the full data This problem frequently emerges when
integrating several commercial query services like travel or classified advertisement web
applications
As of 2009, the trend in data integration favored loosening the coupling between data[citation needed] and providing a unified query interface to access real-time data over a mediated schema (see Figure 2), which allows information to be retrieved directly from the original databases. This is consistent with the SOA approach popular in that era. This approach relies on mappings between
the mediated schema and the schema of original sources and transforming a query into
specialized queries to match the schema of the original databases Such mappings can be
specified in two ways as a mapping from entities in the mediated schema to entities in the
original sources (the Global As View (GAV) approach) or as a mapping from entities in the
original sources to the mediated schema (the Local As View (LAV) approach) The latter
approach requires more sophisticated inferences to resolve a query on the mediated schema but
makes it easier to add new data sources to a (stable) mediated schema
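As an illustration of the GAV approach, the sketch below defines a mediated relation as a view (an ordinary function) over two hypothetical sources; all names and data are invented for illustration.

```python
# Two hypothetical source databases with their own local schemas
source_a = [("Alice", "NY"), ("Bob", "LA")]        # person(name, city)
source_b = [("Alice", "alice@example.com")]        # email(name, addr)

def mediated_person():
    """GAV mapping: the mediated relation person(name, city, email) is
    defined as a view (here, a join) over the source databases."""
    emails = dict(source_b)
    for name, city in source_a:
        yield (name, city, emails.get(name))       # None when no email known

# A query against the mediated schema, unaware of the underlying sources:
print([p for p in mediated_person() if p[1] == "NY"])
# → [('Alice', 'NY', 'alice@example.com')]
```

Under LAV the direction would be reversed: each source would be described as a view over the mediated schema, and answering a query would require rewriting it in terms of those views.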
As of 2010 some of the work in data integration research concerns the semantic
integration problem This problem addresses not the structuring of the architecture of the
integration but how to resolve semantic conflicts between heterogeneous data sources For
example if two companies merge their databases certain concepts and definitions in their
respective schemas like earnings inevitably have different meanings In one database it may
mean profits in dollars (a floating-point number) while in the other it might represent the
number of sales (an integer) A common strategy for the resolution of such problems involves the
use of ontologies which explicitly define schema terms and thus help to resolve semantic
conflicts. This approach represents ontology-based data integration. On the other hand, the problem of combining research results from different bioinformatics repositories requires benchmarking of the similarities, computed from different data sources, on a single criterion such as positive predictive value. This enables the data sources to be directly comparable, and they can be integrated even when the natures of the experiments are distinct.[7]
As of 2011 it was determined that current data modeling methods were imparting data isolation
into every data architecture in the form of islands of disparate data and information silos This
data isolation is an unintended artifact of the data modeling methodology that results in the
development of disparate data models Disparate data models when instantiated as databases
form disparate databases Enhanced data model methodologies have been developed to eliminate
the data isolation artifact and to promote the development of integrated data models[8] One
enhanced data modeling method recasts data models by augmenting them with
structural metadata in the form of standardized data entities As a result of recasting multiple data
models the set of recast data models will now share one or more commonality relationships that
relate the structural metadata now common to these data models Commonality relationships are
a peer-to-peer type of entity relationships that relate the standardized data entities of multiple
data models Multiple data models that contain the same standard data entity may participate in
the same commonality relationship When integrated data models are instantiated as databases
and are properly populated from a common set of master data then these databases are
integrated
Since 2011, data hub approaches have been of greater interest than fully structured (typically relational) Enterprise Data Warehouses. Since 2013, data lake approaches have risen to the level of data hubs (see all three search terms' popularity on Google Trends[9]). These approaches combine unstructured or varied data into one location, but do not necessarily require an (often complex) master relational schema to structure and define all data in the hub.
Q2
EITHER
(a) Explain E-R Model with suitable example
Ans: Refer to the answer for Q.2 (A) above - the E-R model is a "top-down" approach whose building blocks are entities, relationships and attributes.
(b) Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
An entity relationship diagram displays the relationships of the entity sets stored in a database. In other words, we can say that ER diagrams help you to explain the logical structure of databases. At first look, an ER diagram appears very similar to a flowchart; however, the ER diagram includes many specialized symbols, and its meanings make this model unique.
[Sample ER Diagram]
Facts about the ER Diagram Model:
o The ER model allows you to draw a database design.
o It is an easy-to-use graphical tool for modeling data.
o It is widely used in database design.
o It is a GUI representation of the logical structure of a database.
o It helps you to identify the entities which exist in a system and the relationships between those entities.
(b) Differentiate between the Network and Hierarchical data models in DBMS.
Ans: Hierarchical model
1. One-to-many or one-to-one relationships.
2. Based on a parent-child relationship.
3. Retrieval algorithms are complex and asymmetric.
4. Data redundancy is high.
Network model
1. Many-to-many relationships.
2. A record can have many parents as well as many children.
3. Retrieval algorithms are complex but symmetric.
4. Data redundancy is lower than in the hierarchical model.
Relational model
1. One-to-one, one-to-many and many-to-many relationships.
2. Based on relational data structures.
3. Retrieval algorithms are simple and symmetric.
4. Data redundancy is low.
OR
(c) Draw an E-R diagram for a Library Management System.
Ans:
(d) State the advantages and disadvantages of the following file organizations:
(i) Index-sequential file
Ans:
Sequential File Organization
1. A sequential file is designed for efficient processing of records in sorted order on some search key.
o Records are chained together by pointers to permit fast retrieval in search-key order.
o A pointer points to the next record in order.
o Records are stored physically in search-key order (or as close to this as possible).
o This minimizes the number of block accesses.
o Figure 10.15 shows an example with bname as the search key.
2. It is difficult to maintain physical sequential order as records are inserted and deleted.
o Deletion can be managed with the pointer chains.
o Insertion poses problems if there is no space where the new record should go.
o If there is space, use it; else put the new record in an overflow block, adjusting pointers accordingly.
o Figure 10.16 shows the previous example after an insertion.
o Problem: we now have some records out of physical sequential order.
o If there are very few records in overflow blocks, this will work well; if order is lost, reorganize the file.
o Reorganizations are expensive and are done when the system load is low.
3. If insertions rarely occur, we could keep the file in physically sorted order and reorganize when an insertion occurs. In this case the pointer fields are no longer required.
The Sequential File
Fixed format used for records:
Records are the same length.
All fields are the same (order and length).
Field names and lengths are attributes of the file.
One field is the key field:
It uniquely identifies the record.
Records are stored in key sequence.
The Sequential File
New records are placed in a log file or transaction file.
A batch update is performed to merge the log file with the master file.
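The batch update step can be sketched as a classic two-way merge of two files sorted on the key; the record layout and data below are invented for illustration.

```python
def batch_update(master, log):
    """Two-way merge of key-sorted (key, record) lists: a log entry whose
    key matches a master record replaces it; new keys are inserted."""
    out, i, j = [], 0, 0
    while i < len(master) and j < len(log):
        if master[i][0] < log[j][0]:
            out.append(master[i]); i += 1
        elif master[i][0] > log[j][0]:
            out.append(log[j]); j += 1
        else:                          # equal keys: log record replaces master
            out.append(log[j]); i += 1; j += 1
    return out + master[i:] + log[j:]  # append whichever file has records left

master = [(10, "Ann"), (20, "Bob"), (40, "Dan")]   # sorted master file
log    = [(20, "Bob R."), (30, "Cat")]             # sorted transaction file
print(batch_update(master, log))
# → [(10, 'Ann'), (20, 'Bob R.'), (30, 'Cat'), (40, 'Dan')]
```

Because both files are in key sequence, the new master file is produced in a single sequential pass over each, which is what makes the batch approach efficient for this organization.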
(ii) Direct file
Direct Access File System (DAFS) is a network file system similar to Network File System
(NFS) and Common Internet File System (CIFS) that allows applications to transfer data while
bypassing operating system control buffering and network protocol operations that can
bottleneck throughput DAFS uses the Virtual Interface (VI) architecture as its underlying
transport mechanism Using VI hardware an application transfers data to and from application
buffers without using the operating system which frees up the processor and operating system
for other processes and allows files to be accessed by servers using several different operating
systems DAFS is designed and optimized for clustered shared-file network environments that
are commonly used for Internet e-commerce and database applications DAFS is optimized for
high-bandwidth InfiniBand networks and it works with any interconnection that supports VI
including Fibre Channel and Ethernet
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and promote DAFS. Today, more than 85 companies are part of the DAFS Collaborative.
Q3
EITHER
(a) Explain tuple relational calculus
Ans
Relational Calculus
Relational calculus query specifies what is to be retrieved rather than how to retrieve it
No description of how to evaluate a query
In first-order logic (or predicate calculus) predicate is a truth-valued function
with arguments
When we substitute values for the arguments function yields an expression
called a proposition which can be either true or false
Relational Calculus
If a predicate contains a variable (eg 'x is a member of staff'), there must be a range for x
When we substitute some values of this range for x, the proposition may be true; for
other values it may be false
When applied to databases relational calculus has forms tuple and domain
Tuple Relational Calculus
Interested in finding tuples for which a predicate is true Based on use of tuple variables
Tuple variable is a variable that 'ranges over' a named relation ie a variable
whose only permitted values are tuples of the relation
Specify range of a tuple variable S as the Staff relation as
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Tuple Relational Calculus - Example
To find details of all staff earning more than $10000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Tuple Relational Calculus
Can use two quantifiers to tell how many instances the predicate applies to
Existential quantifier ∃ ('there exists')
Universal quantifier ∀ ('for all')
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called
free variables
Tuple Relational Calculus
Existential quantifier used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧
(B.branchNo = S.branchNo) ∧ B.city = 'London')
Means 'There exists a Branch tuple with the same branchNo as the branchNo of the current
Staff tuple S that is located in London'
Tuple Relational Calculus
Universal quantifier is used in statements about every instance, such as:
(∀B) (B.city ≠ 'Paris')
Means 'For all Branch tuples, the address is not in Paris'
Can also use ~(∃B) (B.city = 'Paris'), which means 'There are no branches with an
address in Paris'
Tuple Relational Calculus
Formulae should be unambiguous and make sense
A (well-formed) formula is made out of atoms
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2, where θ is a comparison operator
Si.a1 θ c, where c is a constant
Can recursively build up formulae from atoms
An atom is a formula
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction
F1 ∨ F2, and negation ~F1
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also
formulae
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25000
{S.fName, S.lName | Staff(S) ∧
S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow
{S | Staff(S) ∧ (∃P) (PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Tuple Relational Calculus
Expressions can generate an infinite set For example
{S | ~Staff(S)}
To avoid this add restriction that all values in result must be values in the domain
of the expression
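The calculus expressions above read naturally as Python comprehensions, since both describe what to retrieve rather than how. A small illustrative sketch; the Staff and Branch rows below are invented sample data, not from the text.

```python
# Invented sample relations, represented as lists of dict "tuples".
Staff = [
    {'staffNo': 'S1', 'fName': 'Ann', 'lName': 'Beech',
     'position': 'Manager', 'salary': 30000, 'branchNo': 'B3'},
    {'staffNo': 'S2', 'fName': 'Bob', 'lName': 'Ford',
     'position': 'Assistant', 'salary': 9000, 'branchNo': 'B5'},
    {'staffNo': 'S3', 'fName': 'Cid', 'lName': 'Howe',
     'position': 'Manager', 'salary': 24000, 'branchNo': 'B3'},
]
Branch = [
    {'branchNo': 'B3', 'city': 'London'},
    {'branchNo': 'B5', 'city': 'Paris'},
]

# {S | Staff(S) and S.salary > 10000}
high_paid = [S for S in Staff if S['salary'] > 10000]

# {S.fName, S.lName | Staff(S) and S.position = 'Manager' and S.salary > 25000}
rich_managers = [(S['fName'], S['lName'])
                 for S in Staff
                 if S['position'] == 'Manager' and S['salary'] > 25000]

# Staff(S) and (there exists B)(Branch(B) and B.branchNo = S.branchNo
#                                and B.city = 'London') -- any() plays the
# role of the existential quantifier.
london_staff = [S for S in Staff
                if any(B['branchNo'] == S['branchNo'] and B['city'] == 'London'
                       for B in Branch)]

print(rich_managers)                         # [('Ann', 'Beech')]
print([S['staffNo'] for S in london_staff])  # ['S1', 'S3']
```

Note how `any(...)` plays the role of ∃ ranging over Branch, exactly as in the bound-variable examples above.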
Data Manipulations in SQL
Select Update Delete Insert Statement
Basic Data retrieval
Condition Specification
Arithmetic and Aggregate operators
SQL Join Multiple Table Queries
Set Manipulation
Any In Contains All Not In Not Contains Exists Union Minus Intersect
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement
Data must be entered later using INSERT
CREATE TABLE S ( SNO CHAR(5),
SNAME CHAR(20),
STATUS DECIMAL(3),
CITY CHAR(15),
PRIMARY KEY (SNO) )
Creating Tables
A table name and unique column names must be specified
Columns which are defined as primary keys will never have two rows with the same key
value
Primary key may consist of more than one column (values unique in combination)
called composite key
(b) Explain Data Manipulation in SQL
Ans
A data manipulation language (DML) is a computer programming language used for adding
(inserting) deleting and modifying (updating) data in a database A DML is often
a sublanguage of a broader database language such as SQL with the DML comprising some of
the operators in the language[1] Read-only selecting of data is sometimes distinguished as being
part of a separate data query language (DQL) but it is closely related and sometimes also
considered a component of a DML some operators may perform both selecting (reading) and
writing
A popular data manipulation language is that of Structured Query Language (SQL) which is
used to retrieve and manipulate data in a relational database[2] Other forms of DML are those
used by IMSDLI CODASYL databases such as IDMS and others
In SQL the data manipulation language comprises the SQL-data change statements[3] which
modify stored data but not the schema or database objects Manipulation of persistent database
objects eg tables or stored procedures via the SQL schema statements[3] rather than the data
stored within them is considered to be part of a separate data definition language (DDL) In SQL
these two categories are similar in their detailed syntax data types expressions etc but distinct
in their overall function[3]
The SQL-data change statements are a subset of the SQL-data statements this also contains
the SELECT query statement[3] which strictly speaking is part of the DQL not the DML In
common practice though this distinction is not made and SELECT is widely considered to be
part of DML[4] so the DML consists of all SQL-data statements not only the SQL-data
change statements The SELECT INTO form combines both selection and manipulation
and thus is strictly considered to be DML because it manipulates (ie modifies) data
Data manipulation languages have their functional capability organized by the initial word in a
statement which is almost always a verb In the case of SQL these verbs are
SELECT FROM WHERE (strictly speaking DQL)
SELECT INTO
INSERT INTO VALUES
UPDATE SET WHERE
DELETE FROM WHERE
For example the command to insert a row into table employees
INSERT INTO employees (first_name, last_name, fname) VALUES ('John', 'Capita',
'xcapit00');
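The DML verbs listed above can be exercised end to end with Python's sqlite3 module; the employees table and values are the ones from the example, with an invented UPDATE and DELETE added.

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT INTO ... VALUES
con.execute("INSERT INTO employees (first_name, last_name, fname) "
            "VALUES ('John', 'Capita', 'xcapit00')")

# UPDATE ... SET ... WHERE
con.execute("UPDATE employees SET last_name = 'Capital' WHERE fname = 'xcapit00'")

# SELECT ... FROM ... WHERE (strictly speaking DQL, as noted above)
row = con.execute("SELECT first_name, last_name FROM employees "
                  "WHERE fname = 'xcapit00'").fetchone()
print(row)                                                   # ('John', 'Capital')

# DELETE FROM ... WHERE
con.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
print(con.execute("SELECT COUNT(*) FROM employees").fetchone()[0])   # 0
```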
OR
(c) Explain following integrity rules
(i) Entity Integrity
Integrity rules are imperative to a good database design Most RDBMSs enforce
these rules automatically, but it is safer to make sure that the rules are
already applied in the design There are two types of integrity mentioned in
integrity rules: entity and referential Two additional rules that aren't
necessarily included in integrity rules but are pertinent to database designs
are business rules and domain rules
Entity integrity exists when each primary key within a table has a value that
is unique; this ensures that each row is uniquely identified by the primary
key One requirement for entity integrity is that a primary key cannot have a
null value The purpose of this integrity is for each row to have a unique
identity, so that foreign key values can properly reference primary key values
Theta Join
In a theta join we apply a condition θ on the input relation(s), and then only the selected
rows are used in the cross product and included in the output It means
that in a normal cross product all the rows of one relation are mapped/merged with all
the rows of the second relation, but here only selected rows of
a relation take part in the cross
product with the second relation It is denoted R ⋈θ S
If R and S are two relations, then θ is the condition applied for the select
operation on one relation; only the selected rows form a cross product with all the
rows of the second relation For example, given the two relations FACULTY and
COURSE, we first apply a select operation on the FACULTY relation to
select certain specific rows; these rows then form a cross product with the
COURSE relation This is the difference between a cross product and a theta join
We would now show both relations, their different attributes, and finally the
cross product after carrying out the select operation on the relation
From this example the difference between cross product and theta join becomes clear
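The definition above, a cross product restricted by the condition θ, can be sketched directly; the FACULTY and COURSE rows and the experience attribute are invented for illustration.

```python
# Invented sample relations: (facId, name, experience) and (courseId, title).
FACULTY = [('F1', 'Ali', 18), ('F2', 'Beg', 30)]
COURSE  = [('C1', 'DBMS'), ('C2', 'OS')]

def theta_join(r, s, theta):
    """Cross product of r and s, keeping only the pairs where theta holds."""
    return [a + b for a in r for b in s if theta(a, b)]

# Only FACULTY rows with experience > 20 take part in the cross product:
result = theta_join(FACULTY, COURSE, lambda f, c: f[2] > 20)
print(result)
# [('F2', 'Beg', 30, 'C1', 'DBMS'), ('F2', 'Beg', 30, 'C2', 'OS')]
```

With `theta = lambda f, c: True` the same function degenerates into a plain cross product, which is exactly the difference the text describes.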
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship
In relationships data is linked between two or more tables This is achieved by having
the foreign key (in the associated table) reference a primary key value (in the primary ndash or
parent ndash table) Because of this we need to ensure that data on both sides of the relationship
remain intact
So referential integrity requires that whenever a foreign key value is used it must reference a
valid existing primary key in the parent table
Example
For example if we delete record number 15 in a primary table we need to be sure that there's no
foreign key in any related table with the value of 15 We should only be able to delete a primary
key if there are no associated records Otherwise we would end up with an orphaned record
Here the related table contains a foreign key value that doesn't exist in the primary key field of
the primary table (ie the "CompanyId" field) This has resulted in an "orphaned record"
So referential integrity will prevent users from
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
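The three rules above can be demonstrated with SQLite, whose foreign-key enforcement must be switched on with a PRAGMA; the company/product tables mirror the CompanyId example, with invented rows.

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute("PRAGMA foreign_keys = ON")   # SQLite: enforcement is opt-in
con.execute("CREATE TABLE company (CompanyId INTEGER PRIMARY KEY, name TEXT)")
con.execute("""CREATE TABLE product (
    ProductId INTEGER PRIMARY KEY,
    CompanyId INTEGER REFERENCES company(CompanyId))""")

con.execute("INSERT INTO company VALUES (15, 'Acme')")
con.execute("INSERT INTO product VALUES (1, 15)")        # valid parent exists

try:
    # Adding a record to the related table with no associated primary record:
    con.execute("INSERT INTO product VALUES (2, 99)")
except sqlite3.IntegrityError:
    print('orphan insert rejected')

try:
    # Deleting a primary record that still has matching related records:
    con.execute("DELETE FROM company WHERE CompanyId = 15")
except sqlite3.IntegrityError:
    print('parent delete rejected')
```

Both violations raise an error instead of silently creating orphaned records.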
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned usually
with no indication of an error This could result in records being "lost" in the database because
they're never returned in queries or reports
It could also result in strange results appearing in reports (such as products without an associated
company)
Or worse yet it could result in customers not receiving products they paid for
Worse still it could affect life and death situations such as a hospital patient not receiving the
correct treatment or a disaster relief team not receiving the correct supplies or information
Data Integrity
Referential integrity is a subset of data integrity which is concerned with the accuracy and
consistency of all data (relationship or otherwise) Maintaining data integrity is a crucial part of
working with databases
(d) Explain following in detail with example
(i) Domain
Ans Definition: The domain of a database attribute is the set of all allowable values that
attribute may assume
Examples
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column
In data management and database analysis a data domain refers to all the unique values which
a data element may contain The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values[1]
For example a database table that has information about people with one record per person
might have a gender column This gender column might be declared as a string data type and
allowed to have one of two known code values: 'M' for male, 'F' for female, and NULL for
records where gender is unknown or not applicable (or arguably 'U' for unknown as a sentinel
value) The data domain for the gender column is {'M', 'F'}
In a normalized data model the reference domain is typically specified in a reference table
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value (excluding NULL) Reference tables are formally related to other tables in a
database by the use of foreign keys
Less simple domain boundary rules if database-enforced may be implemented through a check
constraint or in more complex cases in a database trigger For example a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
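The check-constraint approach described above can be sketched with SQLite; the person table is invented, and the constraint enforces the {'M', 'F'} gender domain while still admitting NULL for unknown.

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute("""CREATE TABLE person (
    name   TEXT,
    gender TEXT CHECK (gender IN ('M', 'F')))""")

con.execute("INSERT INTO person VALUES ('Ann', 'F')")
con.execute("INSERT INTO person VALUES ('Pat', NULL)")  # unknown: NULL passes,
                                                        # since a CHECK only
                                                        # fails when FALSE
try:
    con.execute("INSERT INTO person VALUES ('Bob', 'X')")  # outside the domain
except sqlite3.IntegrityError:
    print('value outside domain rejected')
```

A numeric "must be positive" domain would be declared the same way, eg `qty INTEGER CHECK (qty > 0)`.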
(ii) Degree and cardinality
The degree of relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) to the number of occurrences in another
There are three degrees of relationship known as
1 one-to-one (1:1)
2 one-to-many (1:M)
3 many-to-many (M:N)
Note that the latter is written M:N and not M:M
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity A one-
to-one relationship rarely exists in practice, but it can; if it does, you may consider combining
the two entities into one
For example an employee is allocated a company car which can only be driven by that
employee
Therefore there is a one-to-one relationship between employee and company car
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity For
example, taking the employee and department entities shown on the previous page, an employee
works in one department but a department has many employees
Therefore there is a one-to-many relationship between department and employee
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity
The normalisation process discussed earlier would prevent any such relationships but the
definition is included here for completeness
As with one-to-one relationships many-to-many relationships rarely exist Normally they occur
because an entity has been missed
For example an employee may work on several projects at the same time and a project has a
team of many employees
Therefore there is a many-to-many relationship between employee and project
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL) the group responsible for standardization of the
programming language COBOL The DBTG final report appeared in April 1971; it
introduced a new distinct and self-contained language The DBTG is intended to meet the
requirements of many distinct programming languages not just COBOL the user in a
DBTG system is considered to be an ordinary application programmer and the language
therefore is not biased toward any single specific programming language
It is based on the network model In addition to proposing a formal notation for networks (the
Data Definition Language or DDL), the DBTG proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of the conceptual scheme that
was itself defined using the Data Definition Language It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view
Architecture of DBTG Model
The architecture of a DBTG system is illustrated in Figure
The architecture of the DBTG model can be divided into three different levels, as in the
architecture of a database system These are
bull Storage Schema (corresponds to Internal View of database)
bull Schema (corresponds to Conceptual View of database)
bull Subschema (corresponds to External View of database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL)
Schema
In DBTG the Conceptual View is defined by the schema The schema consists
essentially of definitions of the various types of record in the database, the data items they
contain, and the sets into which they are grouped (Here logical record types are referred
to as record types; the fields in a logical record format are called data items)
Subschema
The External View (not a DBTG term) is defined by a subschema A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider By default all
other types of record, data item and set are excluded
In the DBTG model the users are application programmers writing in an ordinary
programming language, such as COBOL, that has been extended to include the DBTG
data manipulation language Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program This
invocation provides the definition of the user work area (UWA) for that program The
UWA contains a distinct location for each type of record (and hence for each data-
item) defined in the subschema The program may refer to these data-item and record
locations by the names defined in the subschema
Q5
EITHER
(a) Define Normalization Explain first and second normal form
Ans Normalization: the process of decomposing unsatisfactory ('bad') relations by
breaking up their attributes into smaller relations
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties
Normalization in industry pays particular attention to
normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
NF2 non-first normal form
1NF R is in 1NF iff all domain values are atomic
2NF R is in 2 NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF R is in 3NF iff R is 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups
To create an unnormalized table
transform data from information source (eg form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which intersection of each row and column contains one and only one value
If a table of data meets the definition of a relation it is in first normal form
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table
Identify repeating group(s) in unnormalized table which repeats for the key attribute(s)
Remove repeating group by
entering appropriate data into the empty columns of rows containing repeating
data ('flattening' the table)
Or by
placing repeating data along with copy of the original key attribute(s) into a
separate relation
Second Normal Form (2NF)
Based on concept of full functional dependency
A and B are attributes of a relation
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A
2NF - A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on the primary key
Second Normal Form (2NF)
1NF and no partial functional dependencies
Partial functional dependency when one or more non-key attributes are functionally
dependent on part of the primary key
Every non-key attribute must be defined by the entire key not just by part of the key
If a relation has a single attribute as its key then it is automatically in 2NF
1NF to 2NF
Identify primary key for the 1NF relation
Identify functional dependencies in the relation
If partial dependencies exist on the primary key remove them by placing them in a new
relation along with copy of their determinant
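The 1NF-to-2NF steps above can be sketched in code. The OrderLine relation below is invented: its key is (orderNo, productNo), but productName depends only on productNo, a partial dependency, so it is moved out with a copy of its determinant.

```python
# Invented 1NF relation with key (orderNo, productNo):
# productName is partially dependent (on productNo alone).
order_line = [
    ('O1', 'P1', 'Bolt', 10),
    ('O1', 'P2', 'Nut',   5),
    ('O2', 'P1', 'Bolt',  7),
]

# Remove the partial dependency: place productName in a new relation along
# with a copy of its determinant, productNo.
product    = sorted({(p, name) for (_, p, name, _) in order_line})

# The remaining relation is fully dependent on the whole key (orderNo, productNo).
order_item = [(o, p, qty) for (o, p, _, qty) in order_line]

print(product)      # [('P1', 'Bolt'), ('P2', 'Nut')]
print(order_item)   # [('O1', 'P1', 10), ('O1', 'P2', 5), ('O2', 'P1', 7)]
```

Note the redundancy ('Bolt' stored twice) disappears from the decomposed design, which is exactly what removing partial dependencies buys.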
Third Normal Form (3NF)
2NF and no transitive dependencies
Transitive dependency a functional dependency between two or more non-key attributes
Based on concept of transitive dependency
A, B and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A through B (Provided that A is not functionally
dependent on B or C)
3NF - A relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key
OR
(c) Explain multivalued dependency with suitable example
Ans
As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies
1 NF2 non-first normal form
2 1NF R is in 1NF iff all domain values are atomic
3 2NF R is in 2 NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4 3NF R is in 3NF iff R is 2NF and every nonkey attribute is non-transitively
dependent on the key
5 BCNF R is in BCNF iff every determinant is a candidate key
6 Determinant an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies
of attribute sets on anything other than a superset of a candidate key A table is said to be in
4NF if and only if it is in BCNF and its multi-valued dependencies are functional
dependencies 4NF removes unwanted data structures: multi-valued dependencies
Either of these conditions must hold in order to be in fourth normal form:
There is no multivalued dependency in the relation, or
There are multivalued dependencies but the attributes are dependent between themselves
The relation must also be in BCNF Fourth normal form differs from BCNF only in that it
uses multivalued dependencies
(d) What are inference axioms Explain its significance in Relational
Database Design
Ans Inference Axioms (A-axioms or Armstrongrsquos Axioms)
An inference axiom is a rule that states if a relation satisfies certain FDs then it must satisfy
certain other FDs
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1 Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}
We want to show Street Zip → City Street Zip
Proof
1 Zip → City – Given
2 Street Zip → Street City – Augmentation of (1) by Street
3 City Street → Zip – Given
4 City Street → City Street Zip – Augmentation of (3) by City Street
5 Street Zip → City Street Zip – Transitivity from (2) and (4)
[From Maier]
1 Let R = (ABCDEGHI), F = {AB → E, AG → J, BE → I, E → G, GI → H}
Show that AB → GH is derived by F
1 AB → E – Given
2 AB → AB – Reflexivity
3 AB → B – Projectivity from (2)
4 AB → BE – Additivity from (1) and (3)
5 BE → I – Given
6 AB → I – Transitivity from (4) and (5)
7 E → G – Given
8 AB → G – Transitivity from (1) and (7)
9 AB → GI – Additivity from (6) and (8)
10 GI → H – Given
11 AB → H – Transitivity from (9) and (10)
12 AB → GH – Additivity from (8) and (11)
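Derivations like the one above can be checked mechanically with the standard attribute-closure algorithm (not from the text): AB → GH holds iff GH lies in the closure of AB under F.

```python
# Compute the attribute closure of a set of attributes under a set of FDs,
# then use it to verify the derivation AB -> GH.
def closure(attrs, fds):
    """attrs: string of single-letter attributes; fds: list of (lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already in the closure, pull in the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

F = [('AB', 'E'), ('AG', 'J'), ('BE', 'I'), ('E', 'G'), ('GI', 'H')]
c = closure('AB', F)
print(sorted(c))        # ['A', 'B', 'E', 'G', 'H', 'I', 'J']
print(set('GH') <= c)   # True: AB -> GH follows from F
```

This is the practical significance of the inference axioms in design: closures decide whether a dependency is implied, which in turn identifies candidate keys and normal-form violations.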
Significance in Relational Database design: A relational database is a database structure,
commonly used in GIS, in which data is stored in two-dimensional tables and multiple
relationships between data elements can be defined and established in an ad-hoc manner
A Relational Database Management System is a database system made up of files with data
elements in a two-dimensional array (rows and columns) This database management system
has the capability to recombine data elements to form different relations, resulting in great
flexibility of data usage
A database that is perceived by the user as a collection of two-dimensional tables
bull Tables are manipulated a set at a time rather than a record at a time
bull SQL is used to manipulate relational databases Proposed by Dr Codd in 1970
bull The basis for the relational database management system (RDBMS)
bull The relational model contains the following components
bull Collection of objects or relations
bull Set of operations to act on the relations
Q6
EITHER
(a) What is deadlock How can it be avoided How can it be
resolved once it occurs
Ans A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user It can be avoided in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order to prevent such instances Essentially,
once a deadlock does occur the DBMS must have a method for detecting the deadlock,
and then to resolve it the DBMS must select a transaction to cancel and revert the entire
transaction until the resources required become available, allowing one transaction to
complete while the other has to be reprocessed at a later time
Explain the meaning of the expression ACID transaction
ACID means Atomicity, Consistency, Isolation, Durability When any transaction happens it
should be atomic: it should either be complete or fully incomplete, with nothing like
semi-complete The database state should remain consistent after the completion of the
transaction If there is more than one transaction, the transactions should be scheduled in
such a fashion that they remain in isolation from one another Durability means that once
a transaction commits, its effects will persist even if there are system failures
What is the purpose of transaction isolation levels
Transaction isolation levels affect how the database is to operate while transactions are in
the process of being changed The purpose is to ensure consistency throughout the database
For example, if I am changing a row which affects the calculations or outputs of several
other rows, then all rows that are affected or possibly affected by a change in the row I'm
working on will be locked from changes until I am complete with my change This isolates
the change and ensures that the data interaction remains accurate and consistent, and is
known as transaction-level consistency The transaction being changed, which may affect
several other pieces of data or rows of input, could also affect how those rows are read
So let's say I'm processing a change to the tax rate in my state; my store clerk shouldn't
be able to read the total cost of a blue shirt, because the total cost row is affected by any
changes in the tax rate row Essentially, how you deal with the reading and viewing of
data while a change is being processed but hasn't been committed is known as the
transaction isolation level Its purpose is to ensure that no one is misinformed prior to
a transaction being committed
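The "resource access order" rule mentioned above can be sketched with Python threads: two transactions that always acquire locks in the same global order cannot end up waiting on each other. Names and the two-lock scenario are invented for illustration.

```python
import threading

# Two shared resources; the global rule is: always lock A before B.
lock_a, lock_b = threading.Lock(), threading.Lock()
completed = []

def transaction(name):
    # Both transactions follow the same lock order, so a circular wait
    # (T1 holds A wants B, T2 holds B wants A) can never arise.
    with lock_a:
        with lock_b:
            completed.append(name)

t1 = threading.Thread(target=transaction, args=('T1',))
t2 = threading.Thread(target=transaction, args=('T2',))
t1.start(); t2.start()
t1.join();  t2.join()
print(sorted(completed))   # ['T1', 'T2'] -- both finish, no deadlock
```

Reversing the lock order in only one of the two functions would reintroduce the circular-wait risk the rule exists to prevent.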
(b) Explain concurrency control and database recovery in detail
Ans In a multiprogramming environment where multiple transactions can be executed
simultaneously it is highly important to control the concurrency of transactions We have
concurrency control protocols to ensure atomicity isolation and serializability of concurrent
transactions Concurrency control protocols can be broadly divided into two categories:
Lock based protocols
Time stamp based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it Locks are of two
kinds:
Binary Locks: A lock on a data item can be in two states; it is either locked or
unlocked
Shared/exclusive: This type of locking mechanism differentiates the locks based on
their uses If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock Allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state Read locks are shared because no data
value is being changed
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed Transactions may unlock the data item after completing the
'write' operation
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking 2PL
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
Strict Two-Phase Locking
The first phase of Strict-2PL is same as 2PL After acquiring all the locks in the first phase the
transaction continues to execute normally But in contrast to 2PL Strict-2PL does not release a
lock after using it Strict-2PL holds all the locks until the commit point and releases all the locks
at a time
Strict-2PL does not have cascading abort as 2PL does
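The strict-2PL behaviour described above, acquire locks as needed during execution, release them all only at commit, can be sketched with an invented toy lock table (a real DBMS would block rather than raise, and would distinguish shared from exclusive locks).

```python
# Minimal sketch of strict two-phase locking: no lock is released before
# commit, so there is no shrinking phase until the transaction ends.
class StrictTwoPhaseTxn:
    def __init__(self, lock_table):
        self.lock_table = lock_table   # item -> holding txn (None if free)
        self.held = []

    def lock(self, item):
        # Growing phase: acquire locks one by one as operations need them.
        if self.lock_table.get(item) not in (None, self):
            raise RuntimeError(f'{item} is locked by another transaction')
        self.lock_table[item] = self
        self.held.append(item)

    def commit(self):
        # Strict 2PL: release every held lock at once, at the commit point.
        for item in self.held:
            self.lock_table[item] = None
        self.held = []

table = {}
t1 = StrictTwoPhaseTxn(table)
t1.lock('x'); t1.lock('y')         # t1 grows, holding both locks

t2 = StrictTwoPhaseTxn(table)
try:
    t2.lock('x')                   # a real system would block; we raise
except RuntimeError as e:
    print(e)

t1.commit()                        # all of t1's locks released together
t2.lock('x')                       # now available to t2
print(table['x'] is t2)            # True
```

Because t1 never exposes a partially released state, t2 can never read values written by an uncommitted transaction, which is why strict 2PL avoids cascading aborts.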
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based protocol This protocol
uses either system time or logical counter as a timestamp
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution whereas timestamp-based protocols start working as soon as a transaction is
created
Every transaction has a timestamp associated with it and the ordering is determined by the age
of the transaction A transaction created at 0002 clock time would be older than all other
transactions that come after it For example any transaction y entering the system at 0004 is
two seconds younger and the priority would be given to the older one
In addition, every data item is given the latest read and write timestamp This lets the system
know when the last 'read and write' operation was performed on the data item
OR
(b) Explain database security mechanisms
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing existing system for any known or unknown vulnerabilities and defining and
implementing a road mapplan to mitigate them
(d) Explain knowledge-based database systems in detail
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties:
Flat data: Data was usually represented in a tabular format with strings or numbers in each
field.
Multiple users: A conventional database needed to support more than one user or system
logged into the same data at the same time.
Transactions: An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-called
ACID properties: Atomicity, Consistency, Isolation and Durability.
Large, long-lived data: A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in the artificial
intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze and reuse
knowledge. For example, see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database. The knowledge-base needed to know facts about the world, for example to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge, but would instead need to store information about thousands of rows that
represented specific humans. Representing that all humans are mortal, and
being able to reason about any given human that they are mortal, is the work of a knowledge-base.
Representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other
customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
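The contrast above can be sketched in a few lines of Python. This is only an illustrative sketch, not any particular knowledge-base product: the table `customers`, the rule list, and the `infer` helper are all hypothetical names invented for this example. A database stores one explicit row per specific fact, while a knowledge base stores the general rule once and derives the specific conclusions by forward chaining:

```python
# Database view: one explicit row per specific fact.
customers = [
    {"name": "George", "age": 42, "species": "human"},
    {"name": "Mary",   "age": 35, "species": "human"},
]

# Knowledge-base view: store "All humans are mortal" once, as a rule.
rules = [("human", "mortal")]

def infer(facts, rules):
    """Forward chaining: apply is-a rules until no new facts are derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for entity, category in list(derived):
            for premise, conclusion in rules:
                if category == premise and (entity, conclusion) not in derived:
                    derived.add((entity, conclusion))
                    changed = True
    return derived

facts = {(c["name"], c["species"]) for c in customers}
print(sorted(infer(facts, rules)))
# [('George', 'human'), ('George', 'mortal'), ('Mary', 'human'), ('Mary', 'mortal')]
```

The database never stored "George is mortal"; the knowledge base derived it from the one general rule, which is exactly the division of labour the text describes.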
As expert systems moved from being prototypes to systems deployed in corporate environments,
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple, distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and object-oriented
communities, object-oriented databases such as Versant emerged. These were systems designed
from the ground up to have support for object-oriented capabilities, but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements, such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution of the term knowledge-base came with the Internet. With the rise of the Internet,
documents, hypertext and multimedia support became critical for any corporate database. It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents. This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge
management vendors, such as Lotus Notes. Knowledge management actually predated the
Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term knowledge-base to describe their repositories, but the
meaning had a subtle difference. In the case of previous knowledge-based systems, the
knowledge was primarily for the use of an automated system, to reason about and draw
conclusions about the world. With knowledge management products, the knowledge was
primarily meant for humans, for example to serve as a repository of manuals, procedures,
policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018 – 2019
Subject: DBMS
MCA 1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without their conflicting with one another. Concurrent access is quite easy if all
users are just reading data: there is no way they can interfere with one another. However, any
practical database has a mix of READ and WRITE operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user
system. It helps you make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously.
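The read-then-write conflict described above can be sketched with ordinary threads and a lock. This is a minimal illustration, not a DBMS implementation: the shared `balance`, the `deposit` function and the counts are all invented for the example. The lock plays the role of a write lock granted by a lock-based concurrency control scheme, serializing the read-modify-write so no update is lost:

```python
import threading

balance = 0
lock = threading.Lock()

def deposit(times):
    """Each call is a stream of tiny read-modify-write 'transactions'."""
    global balance
    for _ in range(times):
        with lock:                 # acquire the "write lock" on the data item
            current = balance      # READ
            balance = current + 1  # WRITE; without the lock, two threads could
                                   # read the same value and one update is lost

threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(balance)  # 40000 — every one of the 4 x 10000 updates survived
```

Removing the `with lock:` line reintroduces exactly the lost-update anomaly that concurrency control exists to prevent.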
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur, or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails.
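The bank-transfer example can be run directly with Python's built-in `sqlite3` module, which provides real transactions. The table name, account ids and amounts are invented for the sketch; the atomicity itself comes from the database engine, since `with conn:` commits on success and rolls back the whole series on any exception:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically: both updates occur, or neither."""
    try:
        with conn:  # transaction: commit on success, rollback on exception
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                         (amount, src))
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
            row = conn.execute("SELECT balance FROM account WHERE id = ?",
                               (src,)).fetchone()
            if row[0] < 0:
                raise ValueError("insufficient funds")  # abort the transaction
    except ValueError:
        pass  # rolled back: the database is exactly as it was before

transfer(conn, "A", "B", 60)    # succeeds: A = 40, B = 60
transfer(conn, "A", "B", 500)   # fails mid-way and rolls back: unchanged
print(dict(conn.execute("SELECT id, balance FROM account")))  # {'A': 40, 'B': 60}
```

After the failed transfer, neither the withdrawal nor the deposit is visible: money was neither lost nor created, which is precisely the atomicity guarantee.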
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one that is closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database; this view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part that
is relevant to the user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level, the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for a DBMS are explained
above.
(C) Describe the structure of DBMS
Ans: The DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update and
retrieve) on the database. The components of the DBMS perform these requested operations on the
database and provide the necessary data to the users.
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - The Data Definition Language compiler processes schema definitions specified
in the DDL. It includes metadata information such as the names of the files, data items, storage
details of each file, mapping information, constraints, etc.
2 DML Compiler and Query Optimizer - DML commands such as insert, update, delete and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized in the best way to execute the query by
the query optimizer and sent to the data manager.
3 Data Manager - The data manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the data manager are:
Converting operations in users' queries, coming from the application programs or the combination of
DML compiler and query optimizer (known as the query processor), from the user's logical view
to the physical file system
Controlling access to DBMS information that is stored on disk
Handling buffers in main memory
Enforcing constraints to maintain the consistency and integrity of the data
Synchronizing the simultaneous operations performed by concurrent users
Controlling backup and recovery operations
4 Data Dictionary - The data dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table
2. Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed
3. Constraints on data, i.e. the range of values permitted
4. Detailed information on physical database design, such as storage structure, access paths, and file and record sizes
5. Access authorization - a description of database users, their responsibilities and their access rights
6. Usage statistics, such as the frequency of queries and transactions
The data dictionary is used to control data integrity, database operation and accuracy. It may be used as an important part of the DBMS.
Importance of the Data Dictionary -
The data dictionary is necessary in databases due to the following reasons:
It improves the control of the DBA over the information system and the users' understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5 Data Files - These contain the data portion of the database.
6 Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7 End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve users: Naïve users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls into this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database — in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naïve users are those for whom the type and range of response is always indicated. Thus, a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naïve users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of physical storage from use of the
data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored; such details are
transparent to the user. Changes can be made to data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic operations), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating from one device to
another, e.g. from optical to magnetic storage, or from tape to disk.
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to duplication of the same data in different files
• Time wasted in entering data again and again
• Computer resources needlessly used
• Difficulty in combining information
2. Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data. So we need to remove this duplication of
data in multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In a
conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness are likely to improve, since the data can now be shared, and the DBMS makes it easy to
respond to anticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, the use of a DBMS
should allow users that don't know programming to interact with the data more easily, unlike a
file processing system where the programmer may need to write new programs to meet every
new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when changing the
data in the database.
5. Integrity can be improved - Since the data of an organization using the database approach is
centralized and would be used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data, the
structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an
ad hoc/temporary manner. Often different systems of an organization would access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database. Different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8. The organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers the work of their unit as the most
important, and therefore considers their needs as the most important. Once a database has been
set up with centralized control, it will be necessary to identify the organization's requirements and
to balance the needs of the competing units. So it may become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher when using the non-procedural languages that
have been developed with DBMSs than when using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand. The overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes such as
recovery and backup from failures, including disk crashes, power failures and software errors,
which may help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain the ER model with a suitable example.
Ans: It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise; it is an
iterative, team-oriented process with all business managers (or designates)
involved, and should be validated with a "bottom-up" approach. It has three primary components: entity,
relationship, attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type; there are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One to many: 1 <------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example:
Given: entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number)
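Mapping this ER example to a relational schema can be sketched with `sqlite3`. This is one plausible mapping, not the only one: here the composite attributes name(...) and address(...) are flattened into simple columns of a single invented `customer` table, and `customer_id` carries over as the primary key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite attributes are flattened into their component simple attributes.
conn.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,  -- key attribute from the ER diagram
        first_name       TEXT,                 -- components of composite: name
        middle_name      TEXT,
        last_name        TEXT,
        phone_number     TEXT,
        date_of_birth    TEXT,
        city             TEXT,                 -- components of composite: address
        state            TEXT,
        zip_code         TEXT,
        street_name      TEXT,                 -- components of composite: street
        street_number    TEXT,
        apartment_number TEXT
    )
""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name) "
             "VALUES (1, 'Sally', 'Smith')")
print(conn.execute("SELECT first_name FROM customer").fetchone())  # ('Sally',)
```

A multivalued attribute (say, several phone numbers) would instead become a separate table keyed by `customer_id`, since a relational column holds one atomic value per row.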
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index sequential files and direct files, we have considered the retrieval and
update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we may get the set of
records which satisfy the given value.
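The construction can be sketched as a secondary index built over a primary-keyed file. The student records and field names below are invented for the illustration; the point is that the secondary key "stud_name" is not unique, so each index entry maps one key value to the set of matching record ids:

```python
from collections import defaultdict

# Student "file" keyed by the primary key stud_id (unique per record)
students = {
    101: {"stud_id": 101, "stud_name": "Ravi", "dept": "MCA"},
    102: {"stud_id": 102, "stud_name": "Asha", "dept": "MCA"},
    103: {"stud_id": 103, "stud_name": "Ravi", "dept": "MBA"},
}

# Secondary index on the non-unique attribute stud_name: each key value
# maps to the LIST of primary keys of records that satisfy it.
by_name = defaultdict(list)
for rec in students.values():
    by_name[rec["stud_name"]].append(rec["stud_id"])

print(by_name["Ravi"])  # [101, 103] — multiple records satisfy one key value
```

Retrieval then follows the index entries back to the primary file, exactly the two-step lookup that a real secondary index performs on disk.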
(D) Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3- EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries -
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot
be further non-loss decomposed into any number of smaller tables.
Another way of expressing this is that every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer    vendor          item
Sally    Liz Claiborne   Blouses
Mary     Liz Claiborne   Blouses
Sally    Jordach         Jeans
Mary     Jordach         Jeans
Sally    Jordach         Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
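The decomposition can be checked mechanically on the sample data above. This sketch represents each table as a set of tuples and performs the three-way natural join by hand; whether the join dependency actually holds is a business rule, but for this sample the join of the three projections reconstructs the original table exactly:

```python
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach",       "Jeans"),
    ("Mary",  "Jordach",       "Jeans"),
    ("Sally", "Jordach",       "Sneakers"),
}

# Project onto the three pairwise tables of the 5NF decomposition
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Three-way natural join: keep (b, v, i) present in all three projections
rejoined = {(b, v, i)
            for b, v in buyer_vendor
            for b2, i in buyer_item if b2 == b
            for v2, i2 in vendor_item if v2 == v and i2 == i}

print(rejoined == buying)  # True — the decomposition is lossless for this data
```

Note that "Claiborne starts to sell jeans" now costs a single row in Vendor-Item, instead of one new row in the original table per buyer who buys jeans.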
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
[Fig. IMS system architecture: application programs A and B (host language + DL/I) access the data through PCBs grouped into PSBs (PSB-A, PSB-B); the IMS control program maps these views onto the physical databases defined by the DBDs.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined by the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated
conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example:
1 PCB    TYPE=DB,DBDNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional dependency: The value of one attribute (the determinant)
determines the value of another attribute.
Candidate key: A possible key.
Each non-key field is functionally dependent on every candidate key, and
no attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of functional dependencies used in normalization:
they have a 1:1 relationship between the attribute(s) on the left- and right-hand
sides of a dependency, hold for all time, and are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a
manageable size. We need to identify a set of functional dependencies (X) for a
relation that is smaller than the complete set of functional dependencies (Y) for
that relation, and that has the property that every functional dependency in Y is
implied by the functional dependencies in X.
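The definition above can be checked mechanically: a dependency X → Y holds in a relation instance exactly when no two tuples agree on X but differ on Y. A minimal Python sketch over hypothetical sample data (the Staff rows below are illustrative, not from the question paper):

```python
def fd_holds(rows, lhs, rhs):
    """Return True iff the functional dependency lhs -> rhs holds
    in this relation instance (rows are dicts, attribute name -> value)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if x in seen and seen[x] != y:
            return False  # two tuples agree on lhs but differ on rhs
        seen[x] = y
    return True

# Hypothetical relation instance
staff = [
    {"staffNo": "S1", "name": "Ann", "branch": "B1"},
    {"staffNo": "S2", "name": "Bob", "branch": "B1"},
    {"staffNo": "S3", "name": "Ann", "branch": "B2"},
]

print(fd_holds(staff, ["staffNo"], ["name"]))  # True: staffNo determines name
print(fd_holds(staff, ["name"], ["branch"]))   # False: Ann appears with B1 and B2
```

Note that this only tests one instance: a dependency that happens to hold in today's data is not necessarily a constraint that holds for all time, which is why the characteristics above matter.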
(D) Explain 4NF with examples.
Ans: Normalization: the process of decomposing unsatisfactory ("bad") relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal-form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normal forms up to 3NF, BCNF, or 4NF;
we will pay particular attention up to 3NF.
Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the
functional dependencies between its attributes.
It is often executed as a series of steps; each step corresponds to a specific normal form which has
known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and
only if it is in BCNF and its only multi-valued dependencies are functional dependencies. 4NF
removes an unwanted data structure: multi-valued dependencies.
For a relation to be in fourth normal form, either of these conditions must hold:
there is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it uses
multivalued dependencies.
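As an illustration, consider a hypothetical CourseTeacherText relation in which teachers and texts are independent multi-valued facts about a course (course →→ teacher and course →→ text). The 4NF decomposition puts each multi-valued dependency in its own relation, and joining the pieces back reconstructs the original without loss; the data below is made up:

```python
from itertools import product

# Unnormalized relation: every teacher of "DB" is paired with every text,
# because teachers and texts are independent facts about the course.
ctx = {
    ("DB", "Smith", "Korth"), ("DB", "Smith", "Date"),
    ("DB", "Jones", "Korth"), ("DB", "Jones", "Date"),
}

# 4NF decomposition: one relation per multi-valued dependency.
course_teacher = {(c, t) for c, t, _ in ctx}
course_text    = {(c, b) for c, _, b in ctx}

# Joining the two 4NF relations on course reconstructs the original
# (a lossless join), so no information was discarded.
rejoined = {(c1, t, b)
            for (c1, t), (c2, b) in product(course_teacher, course_text)
            if c1 == c2}
print(rejoined == ctx)  # True
```

The decomposed relations store 2 + 2 rows instead of 4 combined rows, and adding a new text no longer requires one insert per teacher.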
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by
relational database management systems (RDBMS). Object databases have been considered since the
early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases also offer some kind of query language, allowing objects to be found through a
more declarative programming approach. It is in the area of object query languages, and the
integration of the query and navigational interfaces, that the biggest differences between products are
found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a
relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined. A
general characteristic, however, is that the programming language and the database schema use the
same type definitions.
Multimedia applications are facilitated because the class methods associated with the data are
responsible for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as
the set of all its versions, and object versions can be treated as objects in their own right. Some object
databases also provide systematic support for triggers and constraints, which are the basis of active
databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could retrieve a user's account information
and efficiently provide extensive information such as transactions and account entries.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used
determines how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to
explore the options available in order to prepare for the worst.
SQL Server database recovery is easier to achieve if you are running at least SQL Server 2000,
which has a built-in feature known as the database recovery model. It controls the following:
the speed and size of your transaction log backups, and
the degree to which you are at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery models available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after a media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
The Log Marks feature allows you to place reference points in the transaction log, and to
recover to a log mark.
This model logs CREATE INDEX operations. Recovery from a transaction log backup that includes
index creation is faster because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under
it, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans
2. Conceptual level
This level is also called the logical level. The whole design of the database, such as the relationships
among data and the schema of the data, is described at this level. Database constraints and security are
also implemented at this level of the architecture, which is maintained by the DBA (database
administrator).
3. Internal level
This level is also known as the physical level. It describes how the data is actually stored in the
storage devices, and is responsible for allocating space to the data. This is the lowest level of the
architecture.
(d) Explain
(i) Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the characteristic of being able to modify the schema at one level
of the database system without altering the schema at the next higher level.
There are two types of data independence:
1. Logical Data Independence
o Logical data independence refers to the characteristic of being able to change the conceptual
schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual
view.
o If we make any changes in the conceptual view of the data, then the user view of the data
will not be affected.
o Logical data independence occurs at the user interface level.
2. Physical Data Independence
o Physical data independence can be defined as the capacity to change the internal schema
without having to change the conceptual schema.
o If we make any changes in the storage size of the database system server, then the
conceptual structure of the database will not be affected.
o Physical data independence is used to separate the conceptual level from the internal level.
o Physical data independence occurs at the logical interface level.
(ii) Data Integration
Ans: Data integration involves combining data residing in different sources and providing users with
a unified view of them[1]. This process becomes significant in a variety of situations, which
include both commercial (such as when two similar companies need to merge their databases)
and scientific (combining research results from different bioinformatics repositories, for
example) domains. Data integration appears with increasing frequency as the volume (that is, big
data[2]) and the need to share existing data explode[3]. It has become the focus of extensive
theoretical work, and numerous open problems remain unsolved. Data integration encourages
collaboration between internal as well as external users.
Figure 1: Simple schematic for a data warehouse. The extract, transform, load (ETL) process
extracts information from the source databases, transforms it, and then loads it into the data
warehouse.
Figure 2: Simple schematic for a data-integration solution. A system designer constructs a
mediated schema against which users can run queries. The virtual database interfaces with the
source databases via wrapper code if required.
Issues with combining heterogeneous data sources, often referred to as information silos, under a
single query interface have existed for some time. In the early 1980s, computer scientists began
designing systems for interoperability of heterogeneous databases[4]. The first data integration
system driven by structured metadata was designed at the University of Minnesota in 1991 for
the Integrated Public Use Microdata Series (IPUMS). IPUMS used a data warehousing approach,
which extracts, transforms, and loads data from heterogeneous sources into a single
view schema, so that data from different sources become compatible[5]. By making thousands of
population databases interoperable, IPUMS demonstrated the feasibility of large-scale data
integration. The data warehouse approach offers a tightly coupled architecture because the data
are already physically reconciled in a single queryable repository, so it usually takes little time to
resolve queries[6].
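The warehouse-style ETL flow just described can be sketched in a few lines: extract records from two hypothetical sources with different schemas and coding conventions, transform them into one shared schema (reconciling the incompatible gender codes), and load the results into one store. All names and values here are illustrative:

```python
# Two hypothetical source "databases" with incompatible schemas.
source_a = [{"id": 1, "fullname": "Ann Lee", "gender": "F"}]
source_b = [{"pid": 7, "name": "Bob Roy", "sex": "male"}]

def transform_a(rec):
    # Map source A's fields onto the shared warehouse schema.
    return {"person_id": f"A{rec['id']}", "name": rec["fullname"],
            "gender": rec["gender"]}

def transform_b(rec):
    # Source B spells gender out in words; reconcile to the shared coding.
    code = {"male": "M", "female": "F"}[rec["sex"]]
    return {"person_id": f"B{rec['pid']}", "name": rec["name"],
            "gender": code}

# Load: both sources now answer queries through one view schema.
warehouse = [transform_a(r) for r in source_a] + [transform_b(r) for r in source_b]
print(warehouse)
```

The tight coupling is visible here: any change to a source schema means rewriting its transform, which is exactly the synchronization burden the next paragraph describes.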
The data warehouse approach is less feasible for data sets that are frequently updated, requiring
the extract, transform, load (ETL) process to be continuously re-executed for synchronization.
Difficulties also arise in constructing data warehouses when one has only a query interface to
summary data sources and no access to the full data. This problem frequently emerges when
integrating several commercial query services like travel or classified-advertisement web
applications.
As of 2009 the trend in data integration favored loosening the coupling between data
and providing a unified query interface to access real-time data over a mediated schema
(see Figure 2), which allows information to be retrieved directly from the original databases. This is
consistent with the SOA approach popular in that era. This approach relies on mappings between
the mediated schema and the schemas of the original sources, and on transforming a query into
specialized queries to match the schemas of the original databases. Such mappings can be
specified in two ways: as a mapping from entities in the mediated schema to entities in the
original sources (the Global As View (GAV) approach), or as a mapping from entities in the
original sources to the mediated schema (the Local As View (LAV) approach). The latter
approach requires more sophisticated inferences to resolve a query on the mediated schema, but
makes it easier to add new data sources to a (stable) mediated schema.
As of 2010 some of the work in data integration research concerns the semantic
integration problem. This problem addresses not the structuring of the architecture of the
integration, but how to resolve semantic conflicts between heterogeneous data sources. For
example, if two companies merge their databases, certain concepts and definitions in their
respective schemas, like earnings, inevitably have different meanings. In one database it may
mean profits in dollars (a floating-point number), while in the other it might represent the
number of sales (an integer). A common strategy for the resolution of such problems involves the
use of ontologies, which explicitly define schema terms and thus help to resolve semantic
conflicts. This approach represents ontology-based data integration. On the other hand, the
problem of combining research results from different bioinformatics repositories requires
benchmarking of the similarities computed from different data sources on a single criterion, such as
positive predictive value. This enables the data sources to be directly comparable, and they can be
integrated even when the natures of the experiments are distinct[7].
As of 2011 it was determined that current data modeling methods were imparting data isolation
into every data architecture in the form of islands of disparate data and information silos. This
data isolation is an unintended artifact of the data modeling methodology that results in the
development of disparate data models; disparate data models, when instantiated as databases,
form disparate databases. Enhanced data-model methodologies have been developed to eliminate
the data isolation artifact and to promote the development of integrated data models[8]. One
enhanced data modeling method recasts data models by augmenting them with
structural metadata in the form of standardized data entities. As a result of recasting multiple data
models, the set of recast data models will now share one or more commonality relationships that
relate the structural metadata now common to these data models. Commonality relationships are
a peer-to-peer type of entity relationship that relates the standardized data entities of multiple
data models. Multiple data models that contain the same standard data entity may participate in
the same commonality relationship. When integrated data models are instantiated as databases,
and are properly populated from a common set of master data, then these databases are
integrated.
Since 2011, data hub approaches have been of greater interest than fully structured (typically
relational) enterprise data warehouses. Since 2013, data lake approaches have risen to the level
of data hubs (see the popularity of all three search terms on Google Trends[9]). These approaches
combine unstructured or varied data into one location, but do not necessarily require an (often
complex) master relational schema to structure and define all the data in the hub.
Q2
EITHER
(a) Explain the E-R model with a suitable example.
Ans: The E-R model is a "top-down" approach. This data model allows us to describe how data is
used in a real-world enterprise. Modeling is an iterative, team-oriented process in which all business
managers (or their designates) should be involved, and the design should be validated with a
"bottom-up" approach. The model has three primary components: entities, relationships, and
attributes. There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships, and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an
independent existence and which can be uniquely identified. An entity is an abstraction from the
complexities of some domain. When we speak of an entity, we normally speak of some aspect of
the real world which can be distinguished from other aspects of the real world. An entity may be
a physical object, such as a house or a car; an event, such as a house sale or a car service; or a
concept, such as a customer transaction or order. An entity type is a category; an entity, strictly
speaking, is an instance of a given entity type, and there are usually many instances of an entity
type. Because the term entity type is somewhat cumbersome, most people tend to use the term
entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as
student ID, student name, address, etc.
Attributes are of various types:
Simple/single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another.
Relationships can be thought of as verbs linking two or more nouns. Examples: an owns
relationship between a company and a computer; a supervises relationship between an employee
and a department; a performs relationship between an artist and a song; a proved relationship
between a mathematician and a theorem. Relationships are represented as diamonds connected
by lines to each of the entities in the relationship. The types of relationships are as follows:
One-to-many: 1 <------ M
Many-to-one: M ------> 1
Many-to-many: M ------ M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
(b) Given: entity Customer with attributes customer_id (primary key), name
(first_name, last_name, middle_name), phone_number, date_of_birth,
address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
An entity-relationship diagram displays the relationships of the entity sets stored in a database. In
other words, we can say that ER diagrams help you to explain the logical structure of databases. At
first look, an ER diagram looks very similar to a flowchart; however, an ER diagram includes
many specialized symbols, and its meanings make this model unique.
Sample ER Diagram
Facts about the ER Diagram Model:
o The ER model allows you to draw a database design.
o It is an easy-to-use graphical tool for modeling data.
o It is widely used in database design.
o It is a GUI representation of the logical structure of a database.
o It helps you to identify the entities which exist in a system and the relationships
between those entities.
(b) Differentiate between the network and hierarchical data models in a DBMS.
Ans: Hierarchical model
1. One-to-many or one-to-one relationships
2. Based on parent-child relationships
3. Retrieval algorithms are complex and asymmetric
4. More data redundancy
Network model
1. Many-to-many relationships
2. Many parents as well as many children
3. Retrieval algorithms are complex and symmetric
4. Less data redundancy than in the hierarchical model
Relational model
1. One-to-one, one-to-many, and many-to-many relationships
2. Based on relational data structures
3. Retrieval algorithms are simple and symmetric
4. Less data redundancy
OR
(c) Draw an E-R diagram of a Library Management System.
Ans
(d) State the advantages and disadvantages of the following file organizations:
(i) Index-sequential file
Ans
Sequential File Organization
1. A sequential file is designed for efficient processing of records in sorted order on some
search key.
o Records are chained together by pointers to permit fast retrieval in search-key
order.
o Each pointer points to the next record in order.
o Records are stored physically in search-key order (or as close to this as possible).
o This minimizes the number of block accesses.
o Figure 10.15 shows an example with bname as the search key.
2. It is difficult to maintain physical sequential order as records are inserted and deleted.
o Deletion can be managed with the pointer chains.
o Insertion poses problems if there is no space where the new record should go.
o If there is space, use it; else put the new record in an overflow block
and adjust the pointers accordingly.
o Figure 10.16 shows the previous example after an insertion.
o Problem: we now have some records out of physical sequential order.
o If very few records are in overflow blocks, this will work well.
o If order is lost, reorganize the file.
o Reorganizations are expensive and are done when the system load is low.
3. If insertions rarely occur, we could keep the file in physically sorted order and reorganize
when an insertion occurs. In this case the pointer fields are no longer required.
The Sequential File
A fixed format is used for records: all records are the same length, with the same fields
in the same order and of the same length. Field names and lengths are attributes of the file.
One field is the key field; it uniquely identifies the record, and records are stored in key
sequence.
New records are placed in a log file or transaction file, and a batch update is performed to
merge the log file with the master file.
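The batch update just described (merging a key-ordered master file with a key-ordered log of new records) can be sketched as follows; the record layout is hypothetical:

```python
import heapq

def batch_update(master, log):
    """Merge a sorted master file with a sorted log (transaction) file,
    keeping all records in key sequence. Records are (key, data) pairs."""
    return list(heapq.merge(master, log, key=lambda rec: rec[0]))

# Hypothetical master and log files, already in key sequence.
master = [(10, "Ann"), (30, "Bob"), (50, "Cid")]
log    = [(20, "Dee"), (40, "Eve")]

updated = batch_update(master, log)
print([key for key, _ in updated])  # [10, 20, 30, 40, 50]
```

Because both inputs are already sorted, the merge is a single sequential pass, which is exactly why sequential files defer insertions to a log and apply them in batch.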
(ii) Direct file
Direct Access File System (DAFS) is a network file system, similar to Network File System
(NFS) and Common Internet File System (CIFS), that allows applications to transfer data while
bypassing operating system control, buffering, and network protocol operations that can
bottleneck throughput. DAFS uses the Virtual Interface (VI) architecture as its underlying
transport mechanism. Using VI hardware, an application transfers data to and from application
buffers without using the operating system, which frees up the processor and operating system
for other processes and allows files to be accessed by servers using several different operating
systems. DAFS is designed and optimized for clustered, shared-file network environments that
are commonly used for Internet, e-commerce, and database applications. It is optimized for
high-bandwidth InfiniBand networks, and it works with any interconnection that supports VI,
including Fibre Channel and Ethernet.
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and
promote DAFS; today more than 85 companies are part of the DAFS Collaborative.
Q3
EITHER
(a) Explain tuple relational calculus
Ans
Relational Calculus
A relational calculus query specifies what is to be retrieved rather than how to retrieve it;
there is no description of how to evaluate the query.
In first-order logic (or predicate calculus), a predicate is a truth-valued function
with arguments. When we substitute values for the arguments, the function yields an
expression, called a proposition, which can be either true or false.
If a predicate contains a variable (e.g. 'x is a member of staff'), there must be a range for x.
When we substitute some values of this range for x, the proposition may be true; for
other values it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
Tuple Relational Calculus
Here we are interested in finding tuples for which a predicate is true, based on the use of tuple
variables. A tuple variable is a variable that 'ranges over' a named relation, i.e. a variable
whose only permitted values are tuples of the relation.
We specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Tuple Relational Calculus - Example
To find the details of all staff earning more than $10,000:
{S | Staff(S) ∧ S.salary > 10000}
To retrieve a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
We can use two quantifiers to tell how many instances the predicate applies to:
the existential quantifier ∃ ('there exists'), and
the universal quantifier ∀ ('for all').
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called
free variables.
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧
(B.branchNo = S.branchNo) ∧ B.city = 'London')
This means: 'There exists a Branch tuple with the same branchNo as the branchNo of the current
Staff tuple S, and located in London.'
The universal quantifier is used in statements about every instance, such as:
(∀B)(B.city ≠ 'Paris')
This means: 'For all Branch tuples, the address is not in Paris.'
We can also use ~(∃B)(B.city = 'Paris'), which means 'There are no branches with an
address in Paris.'
Formulae should be unambiguous and make sense.
A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
We can recursively build up formulae from atoms:
an atom is a formula;
if F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction
F1 ∨ F2, and negation ~F1;
if F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also
formulae.
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000:
{S.fName, S.lName | Staff(S) ∧
S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow:
{S | Staff(S) ∧ (∃P)(PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
To avoid this, we add the restriction that all values in the result must be values in the domain
of the expression.
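The declarative flavor of tuple relational calculus maps naturally onto set comprehensions. As a rough analogy (with hypothetical Staff and Branch rows, and dictionaries standing in for tuples), the queries above can be written as:

```python
# Hypothetical relation instances; each dict plays the role of a tuple.
staff = [
    {"staffNo": "S1", "position": "Manager",   "salary": 30000, "branchNo": "B1"},
    {"staffNo": "S2", "position": "Assistant", "salary": 9000,  "branchNo": "B1"},
    {"staffNo": "S3", "position": "Manager",   "salary": 12000, "branchNo": "B2"},
]
branch = [{"branchNo": "B1", "city": "London"}]

# {S.staffNo | Staff(S) ∧ S.salary > 10000}
high_paid = {S["staffNo"] for S in staff if S["salary"] > 10000}

# Existential quantifier: {S.staffNo | Staff(S) ∧ (∃B)(Branch(B) ∧
#   B.branchNo = S.branchNo ∧ B.city = 'London')} -- any() plays the role of ∃.
in_london = {S["staffNo"] for S in staff
             if any(B["branchNo"] == S["branchNo"] and B["city"] == "London"
                    for B in branch)}
print(high_paid, in_london)
```

Like the calculus, the comprehensions say which tuples qualify without prescribing an evaluation strategy; the restriction to tuples of Staff (avoiding the infinite ~Staff(S) set) is what iterating over a concrete relation enforces.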
Data Manipulation in SQL
SELECT, UPDATE, DELETE, and INSERT statements
Basic data retrieval
Condition specification
Arithmetic and aggregate operators
SQL joins: multiple-table queries
Set manipulation: ANY, IN, CONTAINS, ALL, NOT IN, NOT CONTAINS, EXISTS, UNION, MINUS, INTERSECT
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement;
data must be entered later using INSERT.
CREATE TABLE S ( SNO CHAR(5),
SNAME CHAR(20),
STATUS DECIMAL(3),
CITY CHAR(15),
PRIMARY KEY (SNO) );
A table name and unique column names must be specified.
Columns which are defined as primary keys will never have two rows with the same key
value.
A primary key may consist of more than one column (values unique in combination);
this is called a composite key.
(b) Explain data manipulation in SQL.
Ans:
A data manipulation language (DML) is a computer programming language used for adding
(inserting), deleting, and modifying (updating) data in a database. A DML is often
a sublanguage of a broader database language such as SQL, with the DML comprising some of
the operators in the language[1]. Read-only selecting of data is sometimes distinguished as being
part of a separate data query language (DQL), but it is closely related and sometimes also
considered a component of a DML; some operators may perform both selecting (reading) and
writing.
A popular data manipulation language is that of Structured Query Language (SQL), which is
used to retrieve and manipulate data in a relational database[2]. Other forms of DML are those
used by IMS/DL/I, and by CODASYL databases such as IDMS, among others.
In SQL, the data manipulation language comprises the SQL-data change statements[3], which
modify stored data but not the schema or database objects. Manipulation of persistent database
objects (e.g. tables or stored procedures) via the SQL schema statements[3], rather than of the data
stored within them, is considered to be part of a separate data definition language (DDL). In SQL
these two categories are similar in their detailed syntax, data types, expressions, etc., but distinct
in their overall function[3].
The SQL-data change statements are a subset of the SQL-data statements; this subset also contains
the SELECT query statement[3], which, strictly speaking, is part of the DQL, not the DML. In
common practice, though, this distinction is not made, and SELECT is widely considered to be
part of DML[4], so the DML consists of all SQL-data statements, not only the SQL-data
change statements. The SELECT ... INTO form combines both selection and manipulation,
and thus is strictly considered to be DML because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a
statement, which is almost always a verb. In the case of SQL these verbs are:
SELECT ... FROM ... WHERE ... (strictly speaking DQL)
SELECT ... INTO ...
INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...
For example, the command to insert a row into the table employees:
INSERT INTO employees (first_name, last_name, fname) VALUES ('John', 'Capita',
'xcapit00');
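The verbs above can be exercised end to end against an in-memory SQLite database; this is only a sketch, and the table and values are illustrative rather than a prescribed schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT INTO ... VALUES ...
con.execute("INSERT INTO employees (first_name, last_name, fname) VALUES (?, ?, ?)",
            ("John", "Capita", "xcapit00"))

# UPDATE ... SET ... WHERE ...
con.execute("UPDATE employees SET last_name = ? WHERE fname = ?",
            ("Capita Jr", "xcapit00"))

# SELECT ... FROM ... WHERE ... (strictly DQL, but used alongside the DML verbs)
rows = con.execute("SELECT first_name, last_name FROM employees").fetchall()
print(rows)  # [('John', 'Capita Jr')]

# DELETE FROM ... WHERE ...
con.execute("DELETE FROM employees WHERE fname = ?", ("xcapit00",))
print(con.execute("SELECT COUNT(*) FROM employees").fetchone()[0])  # 0
```

Using `?` placeholders rather than string concatenation is the idiomatic way to pass values, since the driver handles quoting.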
OR
(c) Explain the following integrity rules.
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce
these rules automatically, but it is safer to make sure that the rules are
applied in the design. There are two types of integrity mentioned in
integrity rules: entity and referential. Two additional rules that aren't
necessarily included in integrity rules, but are pertinent to database designs,
are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that
is unique; this ensures that each row is uniquely identified by the primary
key. One requirement for entity integrity is that a primary key cannot have a
null value. The purpose of this integrity is to give each row a unique
identity, so that foreign key values can properly reference primary key values.
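Both halves of the rule (primary keys must be unique and non-null) can be demonstrated with a hypothetical table in SQLite; note that SQLite needs the NOT NULL spelled out on non-INTEGER primary key columns:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customer "
            "(customer_id TEXT PRIMARY KEY NOT NULL, name TEXT)")
con.execute("INSERT INTO customer VALUES ('C1', 'Ann')")

# Entity integrity violations: a duplicate key, then a null key.
for bad_row in [("C1", "Bob"), (None, "Eve")]:
    try:
        con.execute("INSERT INTO customer VALUES (?, ?)", bad_row)
    except sqlite3.IntegrityError as e:
        print("rejected:", e)  # both inserts are refused
```

After the two failed inserts the table still holds exactly one uniquely identified row, which is the guarantee entity integrity exists to provide.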
Theta Join
In a theta join we apply a condition on the input relation(s), and then only the
selected rows are used in the cross product to be merged and included in the output.
In a normal cross product, all the rows of one relation are mapped/merged with all
the rows of the second relation, but here only selected rows of a relation are
cross-producted with the second relation. It is denoted R ⋈θ S.
If R and S are two relations, then θ is the condition which is applied for the select
operation on one relation; only the selected rows are then cross-producted with all the
rows of the second relation. For example, given the two relations FACULTY and
COURSE, we first apply a select operation on the FACULTY relation to
select certain specific rows, and then these rows are cross-producted with the
COURSE relation. This is the difference between a cross product and a theta join.
We would now show both relations, their different attributes, and finally the
cross product after carrying out the select operation on the relation; from such an
example the difference between cross product and theta join becomes clear.
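A theta join reduces to a cross product filtered by the condition θ. This sketch uses hypothetical FACULTY and COURSE rows (the original example tables are not reproduced in the text) and applies a select on FACULTY first, as described above:

```python
# Hypothetical relations; the attribute names are illustrative.
faculty = [{"fac_id": 1, "dept": "CS"}, {"fac_id": 2, "dept": "Math"}]
course  = [{"c_id": "C1", "fac_id": 1},
           {"c_id": "C2", "fac_id": 2},
           {"c_id": "C3", "fac_id": 1}]

def theta_join(r, s, theta):
    """Cross product of r and s, keeping only pairs that satisfy theta."""
    return [{**a, **b} for a in r for b in s if theta(a, b)]

# Select on FACULTY first (only CS rows), then theta-join on matching fac_id.
cs_only = [f for f in faculty if f["dept"] == "CS"]
result = theta_join(cs_only, course, lambda f, c: f["fac_id"] == c["fac_id"])
print([row["c_id"] for row in result])  # ['C1', 'C3']
```

A plain cross product of the two relations would produce 2 × 3 = 6 rows; selecting first and filtering with θ leaves only the two meaningful pairings.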
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having
the foreign key (in the associated table) reference a primary key value (in the primary, or
parent, table). Because of this, we need to ensure that data on both sides of the relationship
remain intact.
So referential integrity requires that whenever a foreign key value is used, it must reference a
valid, existing primary key in the parent table.
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no
foreign key in any related table with the value of 15. We should only be able to delete a primary
key if there are no associated records; otherwise we would end up with an orphaned record.
Here the related table contains a foreign key value that doesn't exist in the primary key field of
the primary table (i.e. the "CompanyId" field). This has resulted in an "orphaned record".
So referential integrity will prevent users from:
adding records to a related table if there is no associated record in the primary table;
changing values in a primary table that result in orphaned records in a related table;
deleting records from a primary table if there are matching related records.
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually
with no indication of an error. This could result in records being "lost" in the database, because
they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated
company), or, worse yet, in customers not receiving products they paid for.
Worse still, it could affect life-and-death situations, such as a hospital patient not receiving the
correct treatment, or a disaster relief team not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity which is concerned with the accuracy and
consistency of all data (relationship or otherwise) Maintaining data integrity is a crucial part of
working with databases
(d) Explain domain in detail with example.
Ans: Definition: The domain of a database attribute is the set of all allowable values that
attribute may assume.
Examples
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column.
In data management and database analysis a data domain refers to all the unique values which
a data element may contain The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values[1]
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type and
allowed to have one of two known code values: M for male, F for female, and NULL for
records where gender is unknown or not applicable (or, arguably, U for unknown as a sentinel
value). The data domain for the gender column is {M, F}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value, excluding NULL. Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules if database-enforced may be implemented through a check
constraint or in more complex cases in a database trigger For example a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
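The enumerated domain and the positive-number check constraint described above can be sketched in SQLite; the Person table and its columns are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Gender is restricted to the domain {M, F} (NULL allowed for unknown);
# Age carries the "values must be greater than zero" check constraint.
conn.execute("""CREATE TABLE Person (
    Name   TEXT,
    Gender TEXT CHECK (Gender IN ('M', 'F') OR Gender IS NULL),
    Age    INTEGER CHECK (Age > 0))""")
conn.execute("INSERT INTO Person VALUES ('Mary', 'F', 30)")  # inside the domain

try:
    conn.execute("INSERT INTO Person VALUES ('Sam', 'X', 25)")  # outside the domain
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

try:
    conn.execute("INSERT INTO Person VALUES ('Joe', 'M', -5)")  # fails the Age check
    neg_ok = True
except sqlite3.IntegrityError:
    neg_ok = False
```

The DBMS rejects any value outside the declared domain boundary, enforcing the rule at the schema level rather than in application code.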
(ii) Degree and cardinality
The degree of relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) to the number of occurrences in another
There are three degrees of relationship known as
1 one-to-one (11)
2 one-to-many (1M)
3 many-to-many (MN)
Note that the correct notation for the last one is M:N, not M:M.
One-to-one (11)
This is where one occurrence of an entity relates to only one occurrence in another entity. A
one-to-one relationship rarely exists in practice, but it can; in that case you may consider
combining the two entities into one.
For example an employee is allocated a company car which can only be driven by that
employee
Therefore there is a one-to-one relationship between employee and company car
One-to-Many (1M)
This is where one occurrence in an entity relates to many occurrences in another entity. For
example, taking the employee and department entities shown on the previous page, an employee
works in one department but a department has many employees.
Therefore there is a one-to-many relationship between department and employee
Many-to-Many (MN)
This is where many occurrences in an entity relate to many occurrences in another entity
The normalisation process discussed earlier would prevent any such relationships but the
definition is included here for completeness
As with one-to-one relationships many-to-many relationships rarely exist Normally they occur
because an entity has been missed
For example an employee may work on several projects at the same time and a project has a
team of many employees
Therefore there is a many-to-many relationship between employee and project
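In a relational schema, an M:N relationship such as employee-project is normally resolved with a link (junction) table whose composite key is a pair of foreign keys. A sketch with invented table names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EmpId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Project  (ProjId INTEGER PRIMARY KEY, Title TEXT);
-- The junction table: one row per (employee, project) pairing.
CREATE TABLE WorksOn  (
    EmpId  INTEGER REFERENCES Employee(EmpId),
    ProjId INTEGER REFERENCES Project(ProjId),
    PRIMARY KEY (EmpId, ProjId));
INSERT INTO Employee VALUES (1,'Ann'),(2,'Bob');
INSERT INTO Project  VALUES (10,'Payroll'),(20,'Billing');
-- Ann works on both projects; both employees work on Payroll.
INSERT INTO WorksOn VALUES (1,10),(1,20),(2,10);
""")
payroll_team = [r[0] for r in conn.execute(
    "SELECT Name FROM Employee JOIN WorksOn USING (EmpId) "
    "WHERE ProjId = 10 ORDER BY Name")]
```

Each side of the M:N relationship becomes a 1:M relationship to the junction table, which is why the normalisation process mentioned above eliminates direct M:N links.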
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL) the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct, and self-contained language. The DBTG is intended to meet the
requirements of many distinct programming languages not just COBOL the user in a
DBTG system is considered to be an ordinary application programmer and the language
therefore is not biased toward any single specific programming language
It is based on the network model. In addition to proposing a formal notation for networks (the
Data Definition Language or DDL), the DBTG has proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of the conceptual scheme that
was itself defined using the Data Definition Language. It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view.
Architecture of DBTG Model
The architecture of a DBTG system is illustrated in Figure.
The architecture of the DBTG model can be divided into three different levels, like the
architecture of a database system. These are:
• Storage Schema (corresponds to Internal View of database)
• Schema (corresponds to Conceptual View of database)
• Subschema (corresponds to External View of database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data-items they
contain, and the sets into which they are grouped. (Here logical record types are referred
to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data-items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data-item, and set are excluded.
In the DBTG model the users are application programmers writing in an ordinary
programming language such as COBOL that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data-item)
defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization: The process of decomposing unsatisfactory (bad) relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties
Normalization in industry pays particular attention to
normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups
To create an unnormalized table
transform data from information source (eg form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which intersection of each row and column contains one and only one value
If a table of data meets the definition of a relation it is in first normal form
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table
Identify repeating group(s) in unnormalized table which repeats for the key attribute(s)
Remove repeating group by
entering appropriate data into the empty columns of rows containing repeating
data ('flattening' the table)
Or by
placing repeating data along with copy of the original key attribute(s) into a
separate relation
Second Normal Form (2NF)
Based on concept of full functional dependency
A and B are attributes of a relation
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A
2NF - A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on the primary key
Second Normal Form (2NF)
1NF and no partial functional dependencies
Partial functional dependency when one or more non-key attributes are functionally
dependent on part of the primary key
Every non-key attribute must be defined by the entire key not just by part of the key
If a relation has a single attribute as its key then it is automatically in 2NF
1NF to 2NF
Identify primary key for the 1NF relation
Identify functional dependencies in the relation
If partial dependencies exist on the primary key remove them by placing them in a new
relation along with copy of their determinant
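The 1NF-to-2NF steps above can be sketched with an illustrative order-line relation (table and column names are invented): ProductName depends only on ProductId, part of the key (OrderId, ProductId), so the partial dependency is moved to its own relation together with a copy of its determinant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 1NF relation with key (OrderId, ProductId); ProductName depends only
-- on ProductId, a partial functional dependency.
CREATE TABLE OrderLine1NF (OrderId INT, ProductId INT, ProductName TEXT, Qty INT,
                           PRIMARY KEY (OrderId, ProductId));
INSERT INTO OrderLine1NF VALUES
  (1, 100, 'Pen', 3), (1, 101, 'Pad', 1), (2, 100, 'Pen', 5);

-- 2NF decomposition: the partially dependent attribute goes to a new
-- relation along with a copy of its determinant (ProductId).
CREATE TABLE Product   AS SELECT DISTINCT ProductId, ProductName FROM OrderLine1NF;
CREATE TABLE OrderLine AS SELECT OrderId, ProductId, Qty FROM OrderLine1NF;
""")
# The natural join of the two smaller relations recovers all original rows.
rejoined = conn.execute("""SELECT COUNT(*) FROM OrderLine
                           JOIN Product USING (ProductId)""").fetchone()[0]
# Each product name is now stored exactly once.
products = conn.execute("SELECT COUNT(*) FROM Product").fetchone()[0]
```

The decomposition is lossless (the join gives back the original three rows) while removing the redundancy that caused 'Pen' to be stored twice.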
Third Normal Form (3NF)
2NF and no transitive dependencies
Transitive dependency a functional dependency between two or more non-key attributes
Based on concept of transitive dependency
A, B and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C)
3NF - A relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key
OR
(c) Explain multivalued dependency with suitable example.
Ans:
As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is in
4NF if and only if it is in BCNF and every multi-valued dependency is a functional
dependency. 4NF removes the unwanted data structures: multi-valued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
• There is no multivalued dependency in the relation, or
• There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
uses multivalued dependencies.
Example: In a relation COURSE(Course, Teacher, Book), if each course can be taught by
several teachers and uses several books, with teachers and books independent of each other,
then Course ↠ Teacher and Course ↠ Book are multivalued dependencies; the relation should
be decomposed into (Course, Teacher) and (Course, Book) to reach 4NF.
(d) What are inference axioms Explain its significance in Relational
Database Design
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states if a relation satisfies certain FDs then it must satisfy
certain other FDs
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}
We want to show: Street Zip → Street Zip City
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → Street Zip City – Transitivity of (2) and (4)
[From Maier]
1. Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}
Show that AB → GH is derived by F.
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
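These axioms also justify the attribute-closure algorithm: X → Y is derivable from F exactly when Y lies inside the closure X+ of X. A small sketch in Python, run against the Maier FD set above (note the closure also picks up J via AG → J, which the step-by-step proof did not need):

```python
def closure(attrs, fds):
    """Return the closure of attribute set `attrs` under the FDs `fds`
    (each FD is a (lhs, rhs) pair of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already in the closure,
            # the right side joins it too.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

F = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]
ab_plus = closure("AB", F)        # closure of {A, B} under F
derivable = set("GH") <= ab_plus  # AB -> GH holds iff {G, H} is in AB+
```

This is the mechanical counterpart of the twelve-step derivation: one loop over F until no new attributes appear.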
Significance in Relational Database design: A relational database is a database structure,
commonly used in GIS, in which data is stored in two-dimensional tables and multiple
relationships between data elements can be defined and established in an ad-hoc manner. A
Relational Database Management System is a database system made up of files with data
elements in two-dimensional arrays (rows and columns). This database management system has
the capability to recombine data elements to form different relations, resulting in great
flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables.
• Tables are manipulated a set at a time rather than a record at a time
• SQL is used to manipulate relational databases; the model was proposed by Dr. Codd in 1970
• It is the basis for the relational database management system (RDBMS)
• The relational model contains the following components:
• Collection of objects or relations
• Set of operations to act on the relations
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be handled in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing. Secondly, sometimes they can be avoided by setting a resource access order,
meaning resources must be locked in a certain order to prevent such instances. Essentially,
once a deadlock does occur the DBMS must have a method for detecting the deadlock,
and then to resolve it the DBMS must select a transaction to cancel and revert the entire
transaction until the resources required become available, allowing one transaction to
complete while the other has to be reprocessed at a later time.
Explain the meaning of the expression ACID transaction.
Ans: ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens
it should be atomic: it should either be complete or fully incomplete; there should not
be anything like semi-complete. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, then the transactions
should be scheduled in such a fashion that they remain in isolation of one another. Durability
means that once a transaction commits, its effects will persist even if there are system
failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database is to operate while transactions are in the
process of being changed. Their purpose is to ensure consistency throughout the database. For
example, if I am changing a row which affects the calculations or outputs of several other rows,
then all rows that are affected or possibly affected by a change in the row I'm working on will
be locked from changes until I am complete with my change. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let's
say I'm processing a change to the tax rate in my state; my store clerk shouldn't be able
to read the total cost of a blue shirt, because the total cost row is affected by any changes in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but hasn't been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
(b) Explain concurrency control and database recovery in detail
Ans In a multiprogramming environment where multiple transactions can be executed
simultaneously it is highly important to control the concurrency of transactions We have
concurrency control protocols to ensure atomicity isolation and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
Lock based protocols
Time stamp based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it Locks are of two
kinds:
• Binary Locks – A lock on a data item can be in two states: it is either locked or
unlocked.
• Shared/exclusive – This type of locking mechanism differentiates the locks based on
their uses If a lock is acquired on a data item to perform a write operation it is an
exclusive lock Allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state Read locks are shared because no data
value is being changed
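The shared/exclusive rule reduces to a small compatibility matrix, sketched here (S = shared/read, X = exclusive/write; a simplification of real lock managers, which track queues of waiters as well):

```python
def compatible(held, requested):
    """'S' = shared (read) lock, 'X' = exclusive (write) lock.
    Two lock requests on the same item are compatible only when
    both are shared; an exclusive lock conflicts with everything."""
    return held == "S" and requested == "S"

# Build the full 2x2 compatibility matrix.
matrix = {(h, r): compatible(h, r) for h in "SX" for r in "SX"}
```

Any number of readers may hold shared locks together, but a writer must wait for every other lock, which is precisely why allowing two writers would drive the database into an inconsistent state.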
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed Transactions may unlock the data item after completing the
'write' operation
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
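The growing/shrinking discipline can be checked mechanically. This sketch (operation names are invented for illustration) validates whether a single transaction's lock/unlock sequence obeys the two-phase rule:

```python
def is_two_phase(ops):
    """ops: list of ('lock'|'unlock', item) pairs for one transaction.
    Returns True iff no new lock is requested after the first unlock,
    i.e. the growing phase strictly precedes the shrinking phase."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True          # shrinking phase has begun
        elif shrinking:
            return False              # a lock request after an unlock
    return True

ok  = is_two_phase([("lock", "A"), ("lock", "B"),
                    ("unlock", "A"), ("unlock", "B")])
bad = is_two_phase([("lock", "A"), ("unlock", "A"), ("lock", "B")])
```

The second schedule violates 2PL because the transaction demands a new lock after releasing one; under Strict-2PL all the unlocks would additionally be held back until the commit point.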
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the
transaction continues to execute normally But in contrast to 2PL Strict-2PL does not release a
lock after using it. Strict-2PL holds all the locks until the commit point and releases them all
at once.
Strict-2PL does not have cascading abort as 2PL does
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based protocol This protocol
uses either system time or logical counter as a timestamp
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution whereas timestamp-based protocols start working as soon as a transaction is
created
Every transaction has a timestamp associated with it and the ordering is determined by the age
of the transaction. A transaction created at 00:02 clock time would be older than all other
transactions that come after it. For example, any transaction y entering the system at 00:04 is
two seconds younger, and priority would be given to the older one.
In addition every data item is given the latest read and write-timestamp This lets the system
know when the last 'read and write' operation was performed on the data item.
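A minimal sketch of the basic timestamp-ordering test for a write, using the read/write timestamps just described (a simplified rule; real protocols, e.g. with Thomas's write rule, handle more cases):

```python
def try_write(ts, item):
    """Basic timestamp-ordering write check: transaction with timestamp
    `ts` may write `item` only if no younger transaction has already
    read or written it; otherwise the transaction is rolled back."""
    if ts < item["read_ts"] or ts < item["write_ts"]:
        return "rollback"            # a younger transaction got there first
    item["write_ts"] = max(item["write_ts"], ts)
    return "ok"

q = {"read_ts": 0, "write_ts": 0}
first  = try_write(5, q)  # nothing younger has touched q: allowed
second = try_write(3, q)  # ts 3 < write_ts 5: older writer is rejected
```

Because the ordering is fixed by transaction age rather than by lock acquisition, conflicts are resolved as soon as they are detected, with no waiting and hence no deadlock.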
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d)Explain knowledge based database system in detail
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties:
Flat data: Data was usually represented in a tabular format with strings or numbers in each
field.
Multiple users: A conventional database needed to support more than one user or system
logged into the same data at the same time.
Transactions: An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-
called ACID properties: Atomicity, Consistency, Isolation and Durability.
Large, long-lived data: A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements An expert system requires structured data Not just tables with numbers and
strings but pointers to other objects that in turn have additional pointers The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes subclasses and instances
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database The knowledge-base needed to know facts about the world For example to represent
the statement that All humans are mortal A database typically could not represent this general
knowledge but instead would need to store information about thousands of tables that
represented information about specific humans Representing that all humans are mortal and
being able to reason about any given human that they are mortal is the work of a knowledge-
base Representing that George Mary Sam Jenna Mike and hundreds of thousands of other
customers are all humans with specific ages sex address etc is the work for a database[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined As the technology scaled up it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning and knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without their conflicting with one another. Concurrent access is quite easy if all
users are just reading data: there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur with a multi-
user system. It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (/ˌætəˈmɪsəti/; from Ancient Greek ἄτομος, translit. átomos, lit. 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a
consistent state, that is, money is neither lost nor created if either of those two operations fails.
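The transfer example can be sketched with SQLite, whose connection context manager commits or rolls back a transaction as a unit (account names, balances, and the CHECK constraint standing in for the failing withdrawal are all invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account "
             "(Id TEXT PRIMARY KEY, Balance INTEGER CHECK (Balance >= 0))")
conn.executemany("INSERT INTO Account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

try:
    with conn:  # one atomic transaction: both updates commit, or neither does
        conn.execute("UPDATE Account SET Balance = Balance + 200 WHERE Id = 'B'")
        # The withdrawal fails (A cannot go negative), aborting the transaction.
        conn.execute("UPDATE Account SET Balance = Balance - 200 WHERE Id = 'A'")
except sqlite3.IntegrityError:
    pass  # the whole transfer is rolled back; B's credit is undone too

balances = dict(conn.execute("SELECT Id, Balance FROM Account"))
```

Although B was credited first inside the transaction, the failed withdrawal rolls the whole series back, so the database ends in the same consistent state it started in.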
(B) Give the three level architecture proposal for DBMS.
Ans: Objectives of the three level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
Above three points are explain in detail given bellow-
External Level
This is the highest level one that is closest to the user It is also called the user view The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language such as COBOL or C or a fourth-generation
language specific to the DBMS such as visual FoxPro or MS Access
The end user uses a query language to access data from the database A query language is a
combination of three subordinate language
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus the objectives of the three-level architecture proposal for a DBMS are explained above.
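The external level can be made concrete as a SQL view defined over the conceptual schema. The sketch below uses Python's sqlite3 module only as a convenient embedded DBMS; the table, view and column names are invented for illustration.

```python
import sqlite3

# Illustrative sketch: an external view exposes only the part of the
# conceptual schema relevant to a user group (names invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, name TEXT, salary REAL)")
conn.execute("INSERT INTO employee VALUES (1, 'Meena', 50000), (2, 'Joshi', 60000)")

# End users query the view; the salary column stays hidden, and the
# underlying table can change physically without affecting this view.
conn.execute("CREATE VIEW emp_public AS SELECT emp_id, name FROM employee")
view_rows = conn.execute("SELECT * FROM emp_public ORDER BY emp_id").fetchall()
print(view_rows)  # → [(1, 'Meena'), (2, 'Joshi')]
```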
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - The Data Definition Language compiler processes the schema definitions specified in the DDL and stores metadata information such as the names of files, the data items, the storage details of each file, mapping information, and constraints.
2 DML Compiler and Query Optimizer - The DML commands, such as insert, update, delete and retrieve, from the application program are sent to the DML compiler for compilation into object code for database access. The object code is then optimized by the query optimizer to find the best way to execute the query, and then sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also known as the Database Control System.
The main functions of the Data Manager are:
It converts operations in users' queries, coming from the application programs or from the combination of the DML compiler and query optimizer (together known as the Query Processor), from the user's logical view to the physical file system.
It controls access to the DBMS information that is stored on disk.
It handles the buffers in main memory.
It enforces constraints to maintain the consistency and integrity of the data.
It synchronizes the simultaneous operations performed by concurrent users.
It controls the backup and recovery operations.
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It contains information about:
1 Data - the names of the tables, the names of the attributes of each table, the lengths of attributes, and the number of rows in each table.
2 Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3 Constraints on data, i.e. the range of values permitted.
4 Detailed information on physical database design, such as storage structure, access paths, and file and record sizes.
5 Access authorization - a description of database users, their responsibilities and their access rights.
6 Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to control data integrity, database operation and accuracy, and may be used as an important part of the DBMS.
Importance of Data Dictionary - The data dictionary is necessary in databases for the following reasons:
It improves the control of the DBA over the information system and the users' understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the result of every design phase and of the design decisions.
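As a concrete sketch, the data dictionary of most DBMSs can itself be queried like a table. In SQLite (used below via Python purely for illustration; the table names are invented) the catalog is exposed as sqlite_master:

```python
import sqlite3

# Illustrative sketch: querying the DBMS's own data dictionary.
# SQLite exposes its catalog as the sqlite_master table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")
conn.execute("CREATE INDEX idx_title ON course(title)")

# The dictionary records every stored object: tables, indexes, views ...
objs = conn.execute(
    "SELECT type, name FROM sqlite_master "
    "WHERE name NOT LIKE 'sqlite_%' ORDER BY name").fetchall()
print(objs)  # → [('table', 'course'), ('index', 'idx_title')]
```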
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naiumlve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve Users - Naive users need not be aware of the presence of the database system or of any other system. A user of an automatic teller machine falls into this category: the user is instructed through each step of a transaction, and he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect only a precise portion of the database - in the case of the ATM user, one or more of his or her own accounts. Other naive users are those for whom the type and range of responses is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online Users - These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise within the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application Programmers - Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language, such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator - Centralized control of the database is exercised by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and the access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It also implies separation of the physical storage from the use of the data by an application program, i.e. program/data independence. The user, programmer or application specialist need not know the details of how the data are stored; such details are transparent to the user. Changes can be made to the data without affecting other components of the system, e.g. changing the format of data items (real to integer arithmetic), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updating of the same data in different files
• Time wasted in entering the same data again and again
• Computer resources being needlessly used
• Difficulty in combining information
2 Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data, so we need to remove this duplication of data in multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its currency are likely to improve, since the data can now be shared and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, use of a DBMS should allow users who don't know programming to interact with the data more easily, unlike a file processing system where the programmer may need to write new programs to meet every new demand.
4 Flexibility of the system is improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5 Integrity can be improved - Since the data of the organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates may sometimes lead to entry of incorrect data in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purposes of data interchange or migration between systems.
7 Security can be improved - In conventional systems, applications are developed in an ad hoc, temporary manner. Often different systems of an organization access different components of the operational data; in such an environment, enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8 Organization's requirements can be identified - All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, as the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. It may become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for similar services using conventional systems, since the productivity of programmers can be higher using the non-procedural languages developed with DBMSs than using procedural languages.
10 A data model must be developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems it is more likely that files will be designed as the needs of particular applications demand, and the overall view is often not considered. Building an overall view of an organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes for backup and recovery from failures, including disk crashes, power failures and software errors, which help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans The E-R model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Designing with it is an iterative, team-oriented process involving all business managers (or their designates), and the result should be validated with a "bottom-up" approach. The model has three primary components: entities, relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity - An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type; there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes - An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name, address, etc.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship - A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer; a supervises relationship between an employee and a department; a performs relationship between an artist and a song; a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to one    1 ------ 1
One to many   1 ------ M
Many to one   M ------ 1
Many to many  M ------ M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
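One possible (illustrative, not prescribed) mapping of this entity to a relational table flattens the composite attributes into their simple components. Note the assumption stated in the comment: the multivalued attribute phone_number would properly go into its own table.

```python
import sqlite3

# Illustrative relational mapping of the Customer entity: composite
# attributes (name, address, street) are flattened into simple columns.
# The multivalued attribute phone_number really belongs in a separate
# table customer_phone(customer_id, phone_number); it is kept inline
# here only to keep the sketch short.
ddl = """
CREATE TABLE customer (
    customer_id      INTEGER PRIMARY KEY,
    first_name       TEXT,
    middle_name      TEXT,
    last_name        TEXT,
    phone_number     TEXT,
    date_of_birth    TEXT,
    city             TEXT,
    state            TEXT,
    zip_code         TEXT,
    street_name      TEXT,
    street_number    TEXT,
    apartment_number TEXT
)"""
conn = sqlite3.connect(":memory:")
conn.execute(ddl)

# Read the column list back from the catalog to confirm the mapping.
cols = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
print(len(cols), cols[0])  # → 12 customer_id
```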
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example
Ans In sequential files, index sequential files and direct files, we have considered the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
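The idea in (i)-(iii) can be sketched as follows, building a secondary index from the non-unique attribute stud_name to the matching records (the sample records are invented):

```python
from collections import defaultdict

# Illustrative sketch of secondary-key retrieval: the primary key
# roll_no identifies one record; the secondary key stud_name may
# match several records.
records = [
    {"roll_no": 1, "stud_name": "Amit"},
    {"roll_no": 2, "stud_name": "Neha"},
    {"roll_no": 3, "stud_name": "Amit"},  # same name, different student
]

# Build the secondary index: value of stud_name -> list of primary keys.
secondary_index = defaultdict(list)
for rec in records:
    secondary_index[rec["stud_name"]].append(rec["roll_no"])

# Unlike a primary-key lookup, a secondary-key lookup returns the SET
# of matching records.
print(secondary_index["Amit"])  # → [1, 3]
```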
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE3- EITHER
(A) Let R(A, B, C) and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows:
1 If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and every non-trivial lossless decomposition of it into smaller tables is implied by its candidate keys.
Another way of expressing this is that each join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependency - if a relation cannot be decomposed any further then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell Jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and the vendor; to determine the vendor you must know the buyer and the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
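That this three-way decomposition is lossless for the sample data can be checked mechanically. The sketch below (in Python, with the rows taken from the sample data above) projects the Buying table onto the three two-column tables and natural-joins them back:

```python
# Illustrative check that the 5NF decomposition above is lossless for
# the sample data (rows copied from the Buying table in the answer).
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach",       "Jeans"),
    ("Mary",  "Jordach",       "Jeans"),
    ("Sally", "Jordach",       "Sneakers"),
}

bv = {(b, v) for b, v, i in buying}  # Buyer-Vendor projection
bi = {(b, i) for b, v, i in buying}  # Buyer-Item projection
vi = {(v, i) for b, v, i in buying}  # Vendor-Item projection

# Natural join of the three projections on their common attributes.
rejoined = {(b, v, i)
            for (b, v) in bv
            for (b2, i) in bi if b == b2 and (v, i) in vi}
print(rejoined == buying)  # → True: the join dependency holds
```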
(B) Explain the architecture of an IMS System
Ans Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs
Fig: Architecture of an IMS system - application programs A and B (each written in a host language with embedded DL/I calls) access the IMS control program through their program specification blocks (PSB-A, PSB-B); each PSB consists of PCBs, which map onto the DBDs that define the physical databases.
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD). The mapping of the physical database to storage is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description) - Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called the program specification block (PSB).
PCB (Program Communication Block) - Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block) - The set of all PCBs for a given user forms that user's program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT - The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End users are supported via user-written online application programs; IMS does not provide an integrated query language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency - The value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key - A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
The main characteristics of functional dependencies used in normalization are that they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of a dependency, hold for all time, and are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
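Whether a functional dependency holds in a given relation instance can be checked mechanically: no two rows may agree on the determinant but differ on the dependent attributes. A small sketch (the sample data is invented):

```python
# Illustrative checker for a functional dependency X -> Y in a relation
# instance: the FD fails exactly when two rows agree on X but differ on Y.
def fd_holds(rows, lhs, rhs):
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:  # same determinant, different value
            return False
    return True

students = [
    {"student_id": 1, "name": "Amit", "city": "Nagpur"},
    {"student_id": 2, "name": "Neha", "city": "Pune"},
    {"student_id": 3, "name": "Amit", "city": "Mumbai"},
]

print(fd_holds(students, ["student_id"], ["name"]))  # → True (key determines name)
print(fd_holds(students, ["name"], ["city"]))        # → False (same name, two cities)
```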
(D) Explain 4 NF with examples
Ans Normalization is the process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal form condition that it meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF. Here we will pay particular attention up to 3NF.
The database designers need not normalize to the highest possible normal form
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes.
It is often executed as a series of steps; each step corresponds to a specific normal form which has known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
BCNF: R is in BCNF iff every determinant is a candidate key
Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and only if it is in BCNF and every non-trivial multivalued dependency is in fact a functional dependency. 4NF removes unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form, either:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes are dependent between themselves.
One of these conditions must hold, and the relation must also be in BCNF. Fourth normal form differs from BCNF only in that it also considers multivalued dependencies.
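The effect of a multivalued dependency can be sketched with the classic course/teacher/book example (the data is invented): teachers and books of a course vary independently, so the relation decomposes losslessly into (course, teacher) and (course, book):

```python
# Illustrative sketch of a multivalued dependency: in ctx(course,
# teacher, book), teachers and books vary independently, i.e.
# course ->-> teacher and course ->-> book, so ctx is not in 4NF.
ctx = {
    ("DBMS", "Rao",   "Date"),
    ("DBMS", "Rao",   "Korth"),
    ("DBMS", "Mehta", "Date"),
    ("DBMS", "Mehta", "Korth"),
}

# The 4NF decomposition into two binary relations.
course_teacher = {(c, t) for c, t, b in ctx}
course_book    = {(c, b) for c, t, b in ctx}

# The natural join of the projections reconstructs the original
# relation, so the decomposition is lossless.
rejoined = {(c, t, b)
            for (c, t) in course_teacher
            for (c2, b) in course_book if c == c2}
print(rejoined == ctx)  # → True
```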
Q5
Either
(A) What are object oriented database systems What are its features
Ans Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and of the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
(C) How is database recovery done? Discuss its different types.
Ans SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000. It has a built-in feature known as the database recovery model that controls the following:
The speed and size of your transaction log backups.
The degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery models available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after a media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
This model also logs CREATE INDEX operations. Recovery from a transaction log backup that includes index creations is done at a faster pace, because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
It allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(D) Describe deadlocks in a distributed system.
Ans
o Physical data independence is used to separate conceptual levels from the internal levels
o Physical data independence occurs at the logical interface level
(ii) Data Integration
Ans
Data integration involves combining data residing in different sources and providing users with
a unified view of them[1] This process becomes significant in a variety of situations which
include both commercial (such as when two similar companies need to merge their databases)
and scientific (combining research results from different bioinformatics repositories for
example) domains Data integration appears with increasing frequency as the volume (that is big
data[2]) and the need to share existing data explodes[3] It has become the focus of extensive
theoretical work and numerous open problems remain unsolved Data integration encourages
collaboration between internal as well as external users
Figure 1: Simple schematic for a data warehouse. The extract, transform, load (ETL) process extracts information from the source databases, transforms it and then loads it into the data warehouse.
Figure 2: Simple schematic for a data-integration solution. A system designer constructs a mediated schema against which users can run queries. The virtual database interfaces with the source databases via wrapper code if required.
Issues with combining heterogeneous data sources, often referred to as information silos, under a single query interface have existed for some time. In the early 1980s, computer scientists began designing systems for interoperability of heterogeneous databases [4]. The first data integration system driven by structured metadata was designed at the University of Minnesota in 1991 for the Integrated Public Use Microdata Series (IPUMS). IPUMS used a data warehousing approach, which extracts, transforms, and loads data from heterogeneous sources into a single view schema so that data from different sources become compatible [5]. By making thousands of population databases interoperable, IPUMS demonstrated the feasibility of large-scale data integration. The data warehouse approach offers a tightly coupled architecture because the data are already physically reconciled in a single queryable repository, so it usually takes little time to resolve queries [6].
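The extract, transform, load steps above can be sketched in a few lines. The two source databases, their schemas, and the transformation below are invented for illustration, using Python's built-in sqlite3:

```python
import sqlite3

def build_source(ddl, insert, rows):
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert, rows)
    return conn

# Extract: two heterogeneous sources with differently named tables/columns.
src_a = build_source("CREATE TABLE clients (full_name TEXT, city TEXT)",
                     "INSERT INTO clients VALUES (?, ?)", [("Alice", "NY")])
src_b = build_source("CREATE TABLE customers (name TEXT, location TEXT)",
                     "INSERT INTO customers VALUES (?, ?)", [("bob", "la")])

# Transform: normalise casing so data from both sources become compatible.
rows = [(n.title(), c.upper())
        for n, c in src_a.execute("SELECT full_name, city FROM clients")]
rows += [(n.title(), c.upper())
         for n, c in src_b.execute("SELECT name, location FROM customers")]

# Load: a single queryable warehouse table (the "single view schema").
wh = sqlite3.connect(":memory:")
wh.execute("CREATE TABLE customer (name TEXT, city TEXT)")
wh.executemany("INSERT INTO customer VALUES (?, ?)", rows)
print(sorted(wh.execute("SELECT name, city FROM customer")))
# → [('Alice', 'NY'), ('Bob', 'LA')]
```

Because the data are physically reconciled at load time, queries against `customer` need no access to the original sources.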
The data warehouse approach is less feasible for data sets that are frequently updated, requiring the extract, transform, load (ETL) process to be continuously re-executed for synchronization. Difficulties also arise in constructing data warehouses when one has only a query interface to summary data sources and no access to the full data. This problem frequently emerges when integrating several commercial query services like travel or classified advertisement web applications.
As of 2009, the trend in data integration favored loosening the coupling between data and providing a unified query interface to access real-time data over a mediated schema (see Figure 2), which allows information to be retrieved directly from original databases. This is consistent with the SOA approach popular in that era. This approach relies on mappings between the mediated schema and the schemas of the original sources, and on transforming a query into specialized queries to match the schemas of the original databases. Such mappings can be specified in two ways: as a mapping from entities in the mediated schema to entities in the original sources (the Global As View (GAV) approach), or as a mapping from entities in the original sources to the mediated schema (the Local As View (LAV) approach). The latter approach requires more sophisticated inferences to resolve a query on the mediated schema, but makes it easier to add new data sources to a (stable) mediated schema.
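The GAV idea can be sketched as follows; the source records, field names, and the mediated entity `work(title, year)` are all invented for illustration. Each mediated-schema entity is defined as a view over the sources, so a user query "unfolds" through that view into source accesses:

```python
# Two sources with incompatible local schemas (illustrative data).
src_books = [{"title": "SQL Primer", "yr": 1999}]
src_films = [{"name": "DB Story", "year": 2005}]

def mediated_work():
    """GAV mapping: the mediated entity work(title, year) is defined
    as a view over both sources."""
    for b in src_books:
        yield {"title": b["title"], "year": b["yr"]}
    for f in src_films:
        yield {"title": f["name"], "year": f["year"]}

# A query posed against the mediated schema; the user never touches
# the source schemas directly.
recent = [w["title"] for w in mediated_work() if w["year"] > 2000]
print(recent)  # → ['DB Story']
```

Under LAV the mapping direction is reversed: each source is described as a view over the mediated schema, and answering the query requires inference over those descriptions.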
As of 2010, some of the work in data integration research concerns the semantic integration problem. This problem addresses not the structuring of the architecture of the integration, but how to resolve semantic conflicts between heterogeneous data sources. For example, if two companies merge their databases, certain concepts and definitions in their respective schemas, like earnings, inevitably have different meanings. In one database it may mean profits in dollars (a floating-point number), while in the other it might represent the number of sales (an integer). A common strategy for the resolution of such problems involves the use of ontologies, which explicitly define schema terms and thus help to resolve semantic conflicts. This approach represents ontology-based data integration. On the other hand, the problem of combining research results from different bioinformatics repositories requires benchmarking of the similarities, computed from different data sources, on a single criterion such as positive predictive value. This enables the data sources to be directly comparable and to be integrated even when the natures of the experiments are distinct [7].
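The "earnings" conflict above can be made concrete with a toy ontology-style mapping; the source field names and shared concepts are invented for illustration:

```python
# A tiny "ontology": each source-local term maps onto one shared concept
# plus an explicit unit, so the two kinds of "earnings" are never conflated.
ontology = {
    "db_a.earnings": ("profit", "usd"),         # floating-point dollars
    "db_b.earnings": ("sales_count", "units"),  # integer number of sales
}

def integrate(record):
    # Rewrite each local field into its shared (concept, unit) term.
    return {ontology[field]: value for field, value in record.items()}

merged = [integrate({"db_a.earnings": 1250.50}),
          integrate({"db_b.earnings": 42})]
print(merged)  # → [{('profit', 'usd'): 1250.5}, {('sales_count', 'units'): 42}]
```

Because the unit is carried explicitly, a downstream query cannot accidentally sum dollars with sale counts.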
As of 2011, it was determined that current data modeling methods were imparting data isolation into every data architecture in the form of islands of disparate data and information silos. This data isolation is an unintended artifact of the data modeling methodology that results in the development of disparate data models. Disparate data models, when instantiated as databases, form disparate databases. Enhanced data model methodologies have been developed to eliminate the data isolation artifact and to promote the development of integrated data models [8]. One enhanced data modeling method recasts data models by augmenting them with structural metadata in the form of standardized data entities. As a result of recasting multiple data models, the set of recast data models will now share one or more commonality relationships that relate the structural metadata now common to these data models. Commonality relationships are a peer-to-peer type of entity relationship that relates the standardized data entities of multiple data models. Multiple data models that contain the same standard data entity may participate in the same commonality relationship. When integrated data models are instantiated as databases and are properly populated from a common set of master data, then these databases are integrated.
Since 2011, data hub approaches have been of greater interest than fully structured (typically relational) enterprise data warehouses. Since 2013, data lake approaches have risen to the level of data hubs (see the popularity of all three search terms on Google Trends [9]). These approaches combine unstructured or varied data into one location, but do not necessarily require an (often complex) master relational schema to structure and define all data in the hub.
Q2
EITHER
(a) Explain the E-R model with a suitable example.
Ans: The E-R model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Modeling is an iterative, team-oriented process: all business managers (or designates) should be involved, and the result should be validated with a "bottom-up" approach. Many notation methods exist; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships, and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order. An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type. There are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name, address, etc.
Attributes are of various types
Simple/Single Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One-to-many (1:M)
Many-to-one (M:1)
Many-to-many (M:N)
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
(b) Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
An entity relationship diagram displays the relationships of the entity sets stored in a database. In other words, ER diagrams help you to explain the logical structure of databases. At first glance, an ER diagram looks very similar to a flowchart; however, an ER diagram includes many specialized symbols whose meanings make this model unique.
Sample ER Diagram
Facts about the ER Diagram Model:
o The ER model allows you to draw a database design
o It is an easy-to-use graphical tool for modeling data
o Widely used in database design
o It is a graphical representation of the logical structure of a database
o It helps you to identify the entities which exist in a system and the relationships between those entities
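A common way to map the Customer entity from part (b) to a relation is to flatten its composite attributes (name, address, street) into individual columns. This is a sketch of that standard ER-to-relational rule; the column names are derived from the attribute names above, and sqlite3 is used only to hold the schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite attributes flatten into their component columns; the
# multivalued/derived cases would need separate tables or computation.
conn.execute("""CREATE TABLE customer (
    customer_id      INTEGER PRIMARY KEY,
    first_name       TEXT,
    middle_name      TEXT,
    last_name        TEXT,
    phone_number     TEXT,
    date_of_birth    TEXT,
    city             TEXT,
    state            TEXT,
    zip_code         TEXT,
    street_name      TEXT,
    street_number    TEXT,
    apartment_number TEXT)""")
cols = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
print(len(cols))  # → 12 flattened columns
```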
(b) Differentiate between the Network and Hierarchical data models in DBMS.
Ans: Hierarchical model
1. One-to-many or one-to-one relationships
2. Based on a parent-child relationship
3. Retrieval algorithms are complex and asymmetric
4. More data redundancy
Network model
1. Many-to-many relationships
2. A record can have many parents as well as many children
3. Retrieval algorithms are complex but symmetric
4. Less data redundancy than the hierarchical model
Relational model
1. One-to-one, one-to-many, and many-to-many relationships
2. Based on relational data structures (tables)
3. Retrieval algorithms are simple and symmetric
4. Least data redundancy
OR
(c) Draw an E-R diagram for a Library Management System.
Ans:
(d) State the advantages and disadvantages of the following file organizations:
(i) Index-Sequential file
Ans
Sequential File Organization
1. A sequential file is designed for efficient processing of records in sorted order on some search key.
o Records are chained together by pointers to permit fast retrieval in search-key order.
o Each pointer points to the next record in order.
o Records are stored physically in search-key order (or as close to this as possible).
o This minimizes the number of block accesses.
o Figure 10.15 shows an example with bname as the search key.
2. It is difficult to maintain physical sequential order as records are inserted and deleted.
o Deletion can be managed with the pointer chains.
o Insertion poses problems if there is no space where the new record should go.
o If there is space, use it; otherwise put the new record in an overflow block.
o Adjust pointers accordingly.
o Figure 10.16 shows the previous example after an insertion.
o Problem: we now have some records out of physical sequential order.
o If very few records are in overflow blocks, this will work well.
o If order is lost, reorganize the file.
o Reorganizations are expensive and are done when the system load is low.
3. If insertions rarely occur, we could keep the file in physically sorted order and reorganize when an insertion occurs. In this case, the pointer fields are no longer required.
The Sequential File
A fixed format is used for records:
Records are the same length
All fields are the same (order and length)
Field names and lengths are attributes of the file
One field is the key field
Uniquely identifies the record
Records are stored in key sequence
The Sequential File
New records are placed in a log file or transaction file
Batch update is performed to merge the log file with the master file
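The batch update described above can be sketched directly: because the master file and the transaction (log) file are both in key sequence, a single merge pass rebuilds the master in key order with no random access. File names and records below are invented for illustration:

```python
import heapq

master = [(101, "Ives"), (205, "Khan"), (310, "Lopez")]  # sorted by key field
log    = [(150, "Mehta"), (400, "Ng")]                   # new records, sorted

# One sequential pass merges the two sorted runs into a new master file.
new_master = list(heapq.merge(master, log, key=lambda rec: rec[0]))
print([key for key, _ in new_master])  # → [101, 150, 205, 310, 400]
```

This is why sequential organization suits batch processing: the merge touches each record exactly once, in key order.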
(ii) Direct file
Ans:
Direct Access File System (DAFS) is a network file system, similar to Network File System (NFS) and Common Internet File System (CIFS), that allows applications to transfer data while bypassing operating-system control, buffering, and network protocol operations that can bottleneck throughput. DAFS uses the Virtual Interface (VI) architecture as its underlying transport mechanism. Using VI hardware, an application transfers data to and from application buffers without using the operating system, which frees up the processor and operating system for other processes and allows files to be accessed by servers using several different operating systems. DAFS is designed and optimized for clustered, shared-file network environments that are commonly used for Internet, e-commerce, and database applications. DAFS is optimized for high-bandwidth InfiniBand networks, and it works with any interconnection that supports VI, including Fibre Channel and Ethernet.
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and promote DAFS. Today, more than 85 companies are part of the DAFS Collaborative.
Q3
EITHER
(a) Explain tuple relational calculus
Ans
Relational Calculus
A relational calculus query specifies what is to be retrieved rather than how to retrieve it.
There is no description of how to evaluate a query.
In first-order logic (or predicate calculus), a predicate is a truth-valued function with arguments.
When we substitute values for the arguments, the function yields an expression, called a proposition, which can be either true or false.
Relational Calculus
If a predicate contains a variable (e.g., 'x is a member of staff'), there must be a range for x.
When we substitute some values of this range for x, the proposition may be true; for other values it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
When applied to databases relational calculus has forms tuple and domain
Tuple Relational Calculus
We are interested in finding tuples for which a predicate is true. It is based on the use of tuple variables.
A tuple variable is a variable that 'ranges over' a named relation, i.e., a variable whose only permitted values are tuples of the relation.
We specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Tuple Relational Calculus - Example
To find the details of all staff earning more than $10,000:
{S | Staff(S) ∧ S.salary > 10000}
To retrieve a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Tuple Relational Calculus
We can use two quantifiers to tell how many instances the predicate applies to:
Existential quantifier ∃ ('there exists')
Universal quantifier ∀ ('for all')
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called free variables.
Tuple Relational Calculus
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧ (B.branchNo = S.branchNo) ∧ B.city = 'London')
This means: 'There exists a Branch tuple with the same branchNo as the branchNo of the current Staff tuple S, and it is located in London.'
Tuple Relational Calculus
The universal quantifier is used in statements about every instance, such as:
(∀B)(B.city ≠ 'Paris')
This means: 'For all Branch tuples, the address is not in Paris.'
We can also write ¬(∃B)(B.city = 'Paris'), which means 'There are no branches with an address in Paris.'
Tuple Relational Calculus
Formulae should be unambiguous and make sense.
A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
We can recursively build up formulae from atoms:
An atom is a formula
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction F1 ∨ F2, and negation ¬F1
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also formulae
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000.
{S.fName, S.lName | Staff(S) ∧ S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow.
{S | Staff(S) ∧ (∃P)(PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
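The calculus query for managers earning more than $25,000 states what to retrieve, not how; SQL is its declarative counterpart. A minimal sketch with invented sample data, run against an in-memory SQLite database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Staff (
    staffNo TEXT, fName TEXT, lName TEXT, position TEXT, salary INT)""")
conn.executemany("INSERT INTO Staff VALUES (?, ?, ?, ?, ?)", [
    ("S1", "Ann", "Beech", "Manager", 30000),     # satisfies the predicate
    ("S2", "Ben", "White", "Assistant", 12000),   # filtered out
])
# SQL equivalent of {S.fName, S.lName | Staff(S) ∧ S.position = 'Manager'
#                    ∧ S.salary > 25000}
rows = conn.execute(
    "SELECT fName, lName FROM Staff "
    "WHERE position = 'Manager' AND salary > 25000").fetchall()
print(rows)  # → [('Ann', 'Beech')]
```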
Tuple Relational Calculus
Expressions can generate an infinite set. For example:
{S | ¬Staff(S)}
To avoid this, we add the restriction that all values in the result must be values in the domain of the expression.
Data Manipulations in SQL
Select Update Delete Insert Statement
Basic Data retrieval
Condition Specification
Arithmetic and Aggregate operators
SQL Join Multiple Table Queries
Set Manipulation
Any, In, Contains, All, Not In, Not Contains, Exists, Union, Minus, Intersect
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement.
Data must be entered later using INSERT.
CREATE TABLE S ( SNO    CHAR(5),
                 SNAME  CHAR(20),
                 STATUS DECIMAL(3),
                 CITY   CHAR(15),
                 PRIMARY KEY (SNO) );
Creating Tables
A table name and unique column names must be specified.
Columns which are defined as primary keys will never have two rows with the same key value.
A primary key may consist of more than one column (values unique in combination); this is called a composite key.
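The S table above can be created, populated later with INSERT, and queried; this sketch runs the same DDL against an in-memory SQLite database (SQLite maps CHAR/DECIMAL onto its own type affinities, but the statement is accepted as written):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE S (
    SNO    CHAR(5),
    SNAME  CHAR(20),
    STATUS DECIMAL(3),
    CITY   CHAR(15),
    PRIMARY KEY (SNO))""")
# The table starts empty; rows are entered later using INSERT.
conn.executemany("INSERT INTO S VALUES (?, ?, ?, ?)",
                 [("S1", "Smith", 20, "London"),
                  ("S2", "Jones", 10, "Paris")])
result = conn.execute("SELECT SNO, CITY FROM S ORDER BY SNO").fetchall()
print(result)  # → [('S1', 'London'), ('S2', 'Paris')]
```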
(b) Explain Data Manipulation in SQL
Ans
A data manipulation language (DML) is a computer programming language used for adding (inserting), deleting, and modifying (updating) data in a database. A DML is often a sublanguage of a broader database language such as SQL, with the DML comprising some of the operators in the language [1]. Read-only selecting of data is sometimes distinguished as being part of a separate data query language (DQL), but it is closely related and sometimes also considered a component of a DML; some operators may perform both selecting (reading) and writing.
A popular data manipulation language is that of Structured Query Language (SQL), which is used to retrieve and manipulate data in a relational database [2]. Other forms of DML are those used by IMS/DL1 and by CODASYL databases such as IDMS, among others.
In SQL, the data manipulation language comprises the SQL-data change statements [3], which modify stored data but not the schema or database objects. Manipulation of persistent database objects, e.g., tables or stored procedures, via the SQL schema statements [3], rather than the data stored within them, is considered to be part of a separate data definition language (DDL). In SQL these two categories are similar in their detailed syntax, data types, expressions, etc., but distinct in their overall function [3].
The SQL-data change statements are a subset of the SQL-data statements; the latter also contains the SELECT query statement [3], which, strictly speaking, is part of the DQL, not the DML. In common practice, though, this distinction is not made, and SELECT is widely considered to be part of DML [4], so the DML consists of all SQL-data statements, not only the SQL-data change statements. The SELECT ... INTO form combines both selection and manipulation, and thus is strictly considered to be DML because it manipulates (i.e., modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a statement, which is almost always a verb. In the case of SQL, these verbs are:
SELECT ... FROM ... WHERE ... (strictly speaking, DQL)
SELECT ... INTO ...
INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...
For example, the command to insert a row into the table employees:
INSERT INTO employees (first_name, last_name, fname)
VALUES ('John', 'Capita', 'xcapit00');
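The DML verbs can be exercised in sequence; the table and values mirror the example above (with quoting added), run here against an in-memory SQLite database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")
# INSERT: add a row.
conn.execute("INSERT INTO employees (first_name, last_name, fname) "
             "VALUES ('John', 'Capita', 'xcapit00')")
# UPDATE: modify the row just inserted.
conn.execute("UPDATE employees SET last_name = 'Capito' WHERE fname = 'xcapit00'")
updated = conn.execute("SELECT first_name, last_name FROM employees").fetchall()
print(updated)  # → [('John', 'Capito')]
# DELETE: remove it again.
conn.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
remaining = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(remaining)  # → 0
```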
OR
(c) Explain the following integrity rules:
(i) Entity Integrity
Ans:
Integrity rules are imperative to a good database design. Most RDBMSs enforce these rules automatically, but it is safer to make sure that the rules are applied in the design. There are two types of integrity mentioned in integrity rules: entity and referential. Two additional rules that aren't necessarily included in integrity rules but are pertinent to database designs are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that is unique; this ensures that each row is uniquely identified by the primary key. One requirement for entity integrity is that a primary key cannot have a null value. The purpose of this integrity is for each row to have a unique identity, so that foreign key values can properly reference primary key values.
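Both requirements (unique, non-null primary key) can be demonstrated with a small sketch; the table and data are invented. Note that SQLite deviates from the SQL standard and allows NULL in a PRIMARY KEY column unless NOT NULL is also declared, so it is declared explicitly here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_id TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO emp VALUES ('E1', 'Rao')")

errors = []
for bad in [("E1", "Dup"), (None, "NoKey")]:  # duplicate key, then NULL key
    try:
        conn.execute("INSERT INTO emp VALUES (?, ?)", bad)
    except sqlite3.IntegrityError as e:
        errors.append(str(e))  # both violations are rejected
print(len(errors))  # → 2
```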
Theta Join
In a theta join, we apply a condition on the input relation(s), and then only the selected rows are used in the cross product to be merged and included in the output. This means that in a normal cross product all the rows of one relation are mapped/merged with all the rows of the second relation, but here only the selected rows of a relation take part in the cross product with the second relation. It is denoted as ⋈θ.
If R and S are two relations, then θ is the condition which is applied for the select operation on one relation; only the selected rows are then combined in a cross product with all the rows of the second relation. For example, given two relations FACULTY and COURSE, we first apply a select operation on the FACULTY relation to select certain specific rows, and then these rows form a cross product with the COURSE relation. From this example, the difference between a cross product and a theta join becomes clear.
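The FACULTY/COURSE example can be sketched as follows; the relation contents and the θ condition are invented for illustration:

```python
# Relations as lists of tuples (illustrative data).
faculty = [("F1", "Ali", "CS"), ("F2", "Sara", "Math")]
course = [("C1", "DBMS"), ("C2", "Algebra")]

theta = lambda f: f[2] == "CS"  # the θ condition applied to FACULTY

# Theta join: select the FACULTY rows satisfying θ, then cross-product
# only those survivors with every COURSE row.
result = [f + c for f in faculty if theta(f) for c in course]
print(result)
# → [('F1', 'Ali', 'CS', 'C1', 'DBMS'), ('F1', 'Ali', 'CS', 'C2', 'Algebra')]
```

A plain cross product would have produced four rows; the θ selection first cuts FACULTY to one row, so only two combined rows appear.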
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having the foreign key (in the associated table) reference a primary key value (in the primary, or parent, table). Because of this, we need to ensure that data on both sides of the relationship remain intact.
So, referential integrity requires that whenever a foreign key value is used, it must reference a valid, existing primary key in the parent table.
Example:
For example, if we delete record number 15 in a primary table, we need to be sure that there's no foreign key in any related table with the value of 15. We should only be able to delete a primary key if there are no associated records. Otherwise, we would end up with an orphaned record.
Here, the related table contains a foreign key value that doesn't exist in the primary key field of the primary table (i.e., the "CompanyId" field). This has resulted in an "orphaned record".
So referential integrity will prevent users from
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
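The three rules above can be demonstrated with a small sketch; the tables and the CompanyId example follow the text, with sqlite3 used as the engine (SQLite enforces foreign keys only after the PRAGMA shown):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this switched on
conn.execute("CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    company_id INTEGER REFERENCES company(company_id))""")
conn.execute("INSERT INTO company VALUES (15, 'Acme')")
conn.execute("INSERT INTO product VALUES (1, 15)")

errors = []
# Deleting a parent row that still has matching child rows is rejected.
try:
    conn.execute("DELETE FROM company WHERE company_id = 15")
except sqlite3.IntegrityError as e:
    errors.append(str(e))
# Adding a child with no associated parent record is rejected too.
try:
    conn.execute("INSERT INTO product VALUES (2, 99)")
except sqlite3.IntegrityError as e:
    errors.append(str(e))
print(len(errors))  # → 2, and no orphaned record was created
```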
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually with no indication of an error. This could result in records being "lost" in the database because they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated company).
Or, worse yet, it could result in customers not receiving products they paid for.
Worse still, it could affect life-and-death situations, such as a hospital patient not receiving the correct treatment, or a disaster relief team not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity, which is concerned with the accuracy and consistency of all data (relationship or otherwise). Maintaining data integrity is a crucial part of working with databases.
(d) Explain the following in detail with examples:
(i) Domain
Ans: Definition: The domain of a database attribute is the set of all allowable values that the attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are the only permitted entries in that column.
In data management and database analysis, a data domain refers to all the unique values which a data element may contain. The rule for determining the domain boundary may be as simple as a data type with an enumerated list of values [1].
For example, a database table that has information about people, with one record per person, might have a gender column. This gender column might be declared as a string data type and allowed to have one of two known code values: 'M' for male and 'F' for female, plus NULL for records where gender is unknown or not applicable (or, arguably, 'U' for unknown as a sentinel value). The data domain for the gender column is {'M', 'F'}.
In a normalized data model, the reference domain is typically specified in a reference table. Following the previous example, a Gender reference table would have exactly two records, one per allowed value, excluding NULL. Reference tables are formally related to other tables in a database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check constraint or, in more complex cases, in a database trigger. For example, a column requiring positive numeric values may have a check constraint declaring that the values must be greater than zero.
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
(ii) Degree and cardinality
The degree of a relationship (also known as cardinality) is the number of occurrences in one entity which are associated (or linked) with the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
(Note that the correct notation for the last is M:N, not M:M.)
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-to-one relationship rarely exists in practice, but it can; if one arises, you may consider combining the two entities into one.
For example, an employee is allocated a company car which can only be driven by that employee.
Therefore, there is a one-to-one relationship between employee and company car.
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example, taking the employee and department entities shown on the previous page, an employee works in one department, but a department has many employees.
Therefore, there is a one-to-many relationship between department and employee.
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would prevent any such relationships, but the definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely exist in a finished design. Normally they occur because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a team of many employees.
Therefore, there is a many-to-many relationship between employee and project.
Q4
EITHER
(a) Explain DBTG Data Manipulation.
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on Data Systems Languages (CODASYL), the group responsible for standardization of the programming language COBOL. The DBTG final report appeared in April 1971; it introduced a new, distinct, and self-contained language. The DBTG is intended to meet the requirements of many distinct programming languages, not just COBOL: the user in a DBTG system is considered to be an ordinary application programmer, and the language therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the Data Definition Language, or DDL), the DBTG proposed a Subschema Data Definition Language (Subschema DDL) for defining views of the conceptual scheme that was itself defined using the Data Definition Language. It also proposed a Data Manipulation Language (DML) suitable for writing application programs that manipulate the conceptual scheme or a view.
Architecture of the DBTG Model
The architecture of a DBTG system is illustrated in the Figure. The architecture of the DBTG model can be divided into three different levels, as in the architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema, written in a Data Storage Description Language (DSDL).
Schema
In DBTG, the Conceptual View is defined by the schema. The schema consists essentially of definitions of the various types of record in the database, the data-items they contain, and the sets into which they are grouped. (Here, logical record types are referred to as record types; the fields in a logical record format are called data-items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists essentially of a specification of which schema record types the user is interested in, which schema data-items he or she wishes to see in those records, and which schema relationships (sets) linking those records he or she wishes to consider. By default, all other types of record, data-item, and set are excluded.
In the DBTG model, the users are application programmers writing in an ordinary programming language such as COBOL that has been extended to include the DBTG data manipulation language. Each application program invokes the corresponding subschema; using the COBOL Data Base Facility, for example, the programmer simply specifies the name of the required subschema in the Data Division of the program. This invocation provides the definition of the user work area (UWA) for that program. The UWA contains a distinct location for each type of record (and hence for each data-item) defined in the subschema. The program may refer to these data-item and record locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization: the process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to forms up to 3NF, BCNF, or 4NF.
We will pay particular attention to forms up to 3NF.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table:
transform data from the information source (e.g., a form) into table format with columns and rows
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
If a table of data meets the definition of a relation, it is in first normal form:
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove the repeating group by:
entering appropriate data into the empty columns of rows containing repeating data ('flattening' the table)
or by
placing the repeating data, along with a copy of the original key attribute(s), into a separate relation
Second Normal Form (2NF)
Based on the concept of full functional dependency:
A and B are attributes of a relation; B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A.
2NF: a relation that is in 1NF and in which every non-primary-key attribute is fully functionally dependent on the primary key.
Second Normal Form (2NF)
1NF and no partial functional dependencies.
Partial functional dependency: when one or more non-key attributes are functionally dependent on part of the primary key.
Every non-key attribute must be defined by the entire key, not just by part of the key.
If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.
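The removal step can be sketched on a toy relation; the relation Order(order_no, item_no, item_desc, qty) with key (order_no, item_no) and its data are invented for illustration. Here item_desc depends only on item_no, a partial dependency, so it moves to a new relation together with a copy of its determinant:

```python
# 1NF relation with a partial dependency: item_desc depends on item_no alone.
order_1nf = [
    (1, "A", "Bolt", 10),
    (1, "B", "Nut", 5),
    (2, "A", "Bolt", 7),
]

# 2NF decomposition: OrderLine keeps the full key and qty;
# Item holds the partial dependency with its determinant item_no.
order_line = sorted({(o, i, q) for o, i, _, q in order_1nf})
item = sorted({(i, d) for _, i, d, _ in order_1nf})
print(order_line)  # item_desc is no longer repeated per order
print(item)        # → [('A', 'Bolt'), ('B', 'Nut')]
```

The duplicate "Bolt" description from the 1NF table now appears exactly once, which is the update-anomaly reduction 2NF aims at.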
Third Normal Form (3NF)
2NF and no transitive dependencies.
Transitive dependency: a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency:
A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).
3NF: a relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with a suitable example.
Ans:
As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies
of attribute sets on something other than a superset of a candidate key A table is said to be in
4NF if and only if it is in the BCNF and multi-valued dependencies are functional
dependencies The 4NF removes unwanted data structures: multi-valued dependencies
One of these conditions must hold for a relation to be in fourth normal form:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes involved are dependent between themselves
The relation must also be in BCNF Fourth normal form differs from BCNF only in that it
uses multivalued dependencies
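A suitable example is the classic course/teacher/book relation (hypothetical data): in CTX(course, teacher, book), course →→ teacher and course →→ book, because a course's teachers and its textbooks are independent of each other, so the relation must store their full cross product. The sketch below shows the redundancy and the 4NF decomposition:

```python
from itertools import product

# course ->> teacher and course ->> book: teachers and books for a
# course are independent, so CTX must hold the full cross product.
teachers = {"AI": ["Smith", "Jones"]}
books = {"AI": ["ML Basics", "Logic"]}

ctx = {("AI", t, b) for t, b in product(teachers["AI"], books["AI"])}
assert len(ctx) == 4  # 2 teachers x 2 books -> redundant tuples

# 4NF decomposition splits the two independent multivalued facts:
ct = {(c, t) for c, t, _ in ctx}   # course-teacher
cb = {(c, b) for c, _, b in ctx}   # course-book

# Lossless: the natural join on course rebuilds the original relation.
rejoined = {(c, t, b) for (c, t) in ct for (c2, b) in cb if c == c2}
assert rejoined == ctx
```

After decomposition, adding a third teacher adds one tuple to ct instead of one tuple per book in CTX.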
(d) What are inference axioms Explain its significance in Relational
Database Design
Ans Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states if a relation satisfies certain FDs then it must satisfy
certain other FDs
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1 Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}
We want to show Street Zip → City Street Zip
Proof
1 Zip → City – given
2 Street Zip → Street City – augmentation of (1) by Street
3 City Street → Zip – given
4 City Street → City Street Zip – augmentation of (3) by City Street
5 Street Zip → City Street Zip – transitivity of (2) and (4)
[From Maier]
1 Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}
Show that AB → GH is derivable from F
1 AB → E – given
2 AB → AB – reflexivity
3 AB → B – projectivity from (2)
4 AB → BE – additivity from (1) and (3)
5 BE → I – given
6 AB → I – transitivity from (4) and (5)
7 E → G – given
8 AB → G – transitivity from (1) and (7)
9 AB → GI – additivity from (6) and (8)
10 GI → H – given
11 AB → H – transitivity from (9) and (10)
12 AB → GH – additivity from (8) and (11)
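Derivations like these can be checked mechanically with the standard attribute-closure algorithm, which repeatedly applies the given FDs (the axioms are what justify each step). A minimal sketch:

```python
def closure(attrs, fds):
    """Return the closure of the attribute set under the FDs.
    fds is a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already derivable, add the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Maier's example: F = {AB->E, AG->J, BE->I, E->G, GI->H}
F = [(set("AB"), set("E")), (set("AG"), set("J")), (set("BE"), set("I")),
     (set("E"), set("G")), (set("GI"), set("H"))]
assert set("GH") <= closure(set("AB"), F)   # hence AB -> GH holds

# Ullman's example: F = {City Street -> Zip, Zip -> City}
G = [({"City", "Street"}, {"Zip"}), ({"Zip"}, {"City"})]
assert closure({"Street", "Zip"}, G) == {"Street", "Zip", "City"}
```

X → Y holds iff Y is contained in the closure of X, so this one loop replaces the hand proof.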
Significance in Relational Database design: Armstrong's axioms are sound and complete, so they
derive exactly the functional dependencies in the closure F+; this is what makes them useful for
computing attribute closures and candidate keys during relational database design
A relational database is a database structure in which data is stored in
two-dimensional tables, where multiple relationships between data
elements can be defined and established in an ad-hoc manner A Relational Database Management
System is a database system made up of files with data elements in a two-dimensional array (rows
and columns) This database management system has the capability to recombine data elements
to form different relations, resulting in great flexibility of data usage
A relational database is perceived by the user as a collection of two-dimensional tables
• Tables are manipulated a set at a time rather than a record at a time
• SQL is used to manipulate relational databases Proposed by Dr E F Codd in 1970
• The basis for the relational database management system (RDBMS)
• The relational model contains the following components
• Collection of objects or relations
• Set of operations to act on the relations
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user It can be avoided in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order to prevent such instances Once a deadlock
does occur, the DBMS must have a method for detecting it; to resolve it, the DBMS must select a
transaction to cancel and revert that entire transaction until the required resources become
available, allowing one transaction to complete while the other is reprocessed at a later time
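The second prevention technique above, fixing a global resource access order, can be sketched with Python threads and two hypothetical accounts (names and amounts are illustrative):

```python
import threading

lock_a, lock_b = threading.Lock(), threading.Lock()
balance = {"A": 100, "B": 50}

def transfer(src, dst, amount):
    # Always acquire locks in a fixed (alphabetical) order, regardless
    # of transfer direction, so two concurrent transfers can never hold
    # one lock each while waiting for the other -> no deadlock.
    first, second = sorted([src, dst])
    locks = {"A": lock_a, "B": lock_b}
    with locks[first]:
        with locks[second]:
            balance[src] -= amount
            balance[dst] += amount

# Opposite-direction transfers: without the fixed order, these could
# deadlock (each holding one lock, waiting on the other).
t1 = threading.Thread(target=transfer, args=("A", "B", 10))
t2 = threading.Thread(target=transfer, args=("B", "A", 5))
t1.start(); t2.start(); t1.join(); t2.join()
assert balance == {"A": 95, "B": 55}
```

With the sorted-order rule both threads contend for lock_a first, so a circular wait is impossible.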
(b) Explain the meaning of the expression ACID transaction
Ans ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens it
should be atomic: it should either complete fully or not at all; there should not be anything
like a semi-complete transaction. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, the transactions should be
scheduled in such a fashion that they remain in isolation of one another. Durability means that
once a transaction commits, its effects will persist even if there are system failures.
(c) What is the purpose of transaction isolation levels?
Ans Transaction isolation
levels affect how the database is to operate while transactions are in the process of being
changed. Their purpose is to ensure consistency throughout the database. For example, if I
am changing a row which affects the calculations or outputs of several other rows, then
all rows that are affected, or possibly affected, by a change in the row I am working on will
be locked from changes until I am complete with my change. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. Say I am
processing a change to the tax rate in my state; my store clerk should not be able
to read the total cost of a blue shirt, because the total cost row is affected by any changes in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but has not been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed
(b) Explain concurrency control and database recovery in detail
Ans In a multiprogramming environment where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions We have
concurrency control protocols to ensure atomicity, isolation and serializability of concurrent
transactions Concurrency control protocols can be broadly divided into two categories:
Lock based protocols
Time stamp based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it Locks are of two
kinds:
Binary locks: A lock on a data item can be in two states; it is either locked or
unlocked
Shared/exclusive: This type of locking mechanism differentiates the locks based on
their uses If a lock is acquired on a data item to perform a write operation it is an
exclusive lock Allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state Read locks are shared because no data
value is being changed
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed Transactions may unlock the data item after completing the
'write' operation
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
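The two-phase rule itself, that no new lock may be acquired after the first release, can be sketched in a few lines (this is a minimal illustration of the rule, not a full lock manager):

```python
class TwoPhaseTxn:
    """Tracks one transaction's locks and enforces the 2PL rule."""

    def __init__(self):
        self.held = set()
        self.shrinking = False   # flips when the first lock is released

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock after first unlock")
        self.held.add(item)      # growing phase

    def unlock(self, item):
        self.shrinking = True    # growing phase ends at the first unlock
        self.held.discard(item)

txn = TwoPhaseTxn()
txn.lock("x"); txn.lock("y")     # growing phase
txn.unlock("x")                  # shrinking phase begins
try:
    txn.lock("z")                # illegal under 2PL
except RuntimeError as e:
    print(e)                     # -> 2PL violation: lock after first unlock
```

Enforcing this single rule on every transaction is what guarantees conflict-serializable schedules.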
Strict Two-Phase Locking
The first phase of Strict-2PL is same as 2PL After acquiring all the locks in the first phase the
transaction continues to execute normally But in contrast to 2PL Strict-2PL does not release a
lock after using it Strict-2PL holds all the locks until the commit point and releases all the locks
at a time
Strict-2PL does not have cascading abort as 2PL does
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based protocol This protocol
uses either system time or logical counter as a timestamp
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution whereas timestamp-based protocols start working as soon as a transaction is
created
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction A transaction created at clock time 00:02 would be older than all other
transactions that come after it For example, any transaction y entering the system at 00:04 is
two seconds younger, and priority would be given to the older one
In addition, every data item is given the latest read-timestamp and write-timestamp This lets the
system know when the last 'read' and 'write' operations were performed on the data item
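The basic timestamp-ordering rules for a single data item can be sketched as follows (a simplified illustration, assuming each transaction carries the timestamp it was assigned at start):

```python
class Item:
    """One data item with the latest read/write timestamps."""

    def __init__(self):
        self.read_ts = 0    # latest read timestamp
        self.write_ts = 0   # latest write timestamp

    def read(self, ts):
        if ts < self.write_ts:   # item already overwritten by a younger txn
            return False         # reject: reader must abort and restart
        self.read_ts = max(self.read_ts, ts)
        return True

    def write(self, ts):
        if ts < self.read_ts or ts < self.write_ts:
            return False         # reject: this write arrived too late
        self.write_ts = ts
        return True

x = Item()
assert x.write(2)       # transaction with timestamp 2 writes x
assert not x.read(1)    # older transaction 1 arrives later: read rejected
assert x.read(3)        # younger transaction 3 may read
```

A rejected operation means the transaction restarts with a new timestamp, which is how the protocol serializes transactions by age without using locks.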
OR
(b) Explain database security mechanisms
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or under user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d)Explain knowledge based database system in detail
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data Data was usually represented in a tabular format with strings or numbers in each
field
Multiple users A conventional database needed to support more than one user or system
logged into the same data at the same time
Transactions An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users These are the so-
called ACID properties Atomicity Consistency Isolation and Durability
Large long-lived data A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data Such a database usually needed to persist past
the specific uses of any individual program it needed to store data for years and decades
rather than for the life of a program
The first knowledge-based systems had data needs that were the opposite of these database
requirements An expert system requires structured data Not just tables with numbers and
strings but pointers to other objects that in turn have additional pointers The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes subclasses and instances
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database The knowledge-base needed to know facts about the world For example to represent
the statement that All humans are mortal A database typically could not represent this general
knowledge but instead would need to store information about thousands of tables that
represented information about specific humans Representing that all humans are mortal and
being able to reason about any given human that they are mortal is the work of a knowledge-
base Representing that George Mary Sam Jenna Mike and hundreds of thousands of other
customers are all humans with specific ages sex address etc is the work for a database[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined As the technology scaled up it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans Concurrency control is the procedure in DBMS for managing simultaneous
operations without them conflicting with one another Concurrent access is quite easy if all
users are just reading data; there is no way they can interfere with one another However, any
practical database has a mix of READ and WRITE operations, and hence concurrency is a
challenge
Concurrency control is used to address such conflicts, which mostly occur with a multi-
user system It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases
Therefore concurrency control is a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur or nothing occurs[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright As a consequence, the transaction cannot be observed to be in progress by another
database client At one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened if the transaction was cancelled in progress)
An example of an atomic transaction is a monetary transfer from bank account A to account B
It consists of two operations: withdrawing the money from account A and saving it to account B
Performing these operations in an atomic transaction ensures that the database remains in a
consistent state, that is, money is neither lost nor created if either of those two operations fails
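The transfer described above can be sketched with hypothetical in-memory accounts, where a before-image (as in an undo log) makes the pair of operations all-or-nothing:

```python
accounts = {"A": 100, "B": 50}

def transfer(src, dst, amount):
    snapshot = dict(accounts)          # before-image, as in an undo log
    try:
        if accounts[src] < amount:
            raise ValueError("insufficient funds")
        accounts[src] -= amount
        accounts[dst] += amount        # any failure here gets rolled back
    except Exception:
        accounts.clear()
        accounts.update(snapshot)      # undo: restore the before-image
        raise

transfer("A", "B", 30)
assert accounts == {"A": 70, "B": 80}

try:
    transfer("A", "B", 999)            # fails: neither operation applies
except ValueError:
    pass
assert accounts == {"A": 70, "B": 80}  # state unchanged: atomicity
```

Real DBMSs achieve the same effect with write-ahead logging rather than a full snapshot, but the guarantee is identical: commit applies both operations, abort applies neither.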
(B) Give the three level architecture proposal for DBMS
Ans Objectives of the three level architecture proposal for DBMS:
All users should be able to access the same data
A user's view is immune to changes made in other views
Users should not need to know physical database storage details
The DBA should be able to change database storage structures without affecting the users' views
The internal structure of the database should be unaffected by changes to physical aspects of storage
The DBA should be able to change the conceptual structure of the database without affecting all users
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
These three levels are explained in detail below -
External Level
This is the highest level one that is closest to the user It is also called the user view The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access
The end user uses a query language to access data from the database A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects The data control language is used to
control the user's access to database objects
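The three sublanguages can be seen in action using Python's built-in sqlite3 module (table and column names here are illustrative; note SQLite itself has no DCL GRANT/REVOKE, so the DCL statement is shown only as a comment):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# DDL: define and declare the database object
con.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on the object
con.execute("INSERT INTO student VALUES (1, 'Asha')")
con.execute("INSERT INTO student VALUES (2, 'Ravi')")
rows = con.execute("SELECT name FROM student ORDER BY roll").fetchall()
print(rows)   # -> [('Asha',), ('Ravi',)]

# DCL (not supported by SQLite; in other DBMSs it would look like):
#   GRANT SELECT ON student TO some_user;
```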
Conceptual Level - This level comes between the external and the internal levels The
conceptual level represents the entire database as a whole and is used by the DBA This level is
the view of the data 'as it really is' The user's view of the data is constrained by the language
that they are using At the conceptual level the data is viewed without any of these constraints
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
So that objective of three level of architecture proposal for DBMS are suitable explain in
above
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2 DML Compiler and Query optimizer - The DML commands such as insert update delete
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access The object code is then optimized in the best way to execute a query by
the query optimizer and then send to the data manager
3 Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System
The main functions of the Data Manager are -
Convert operations in users' queries, coming from the application programs or from the
combination of DML compiler and query optimizer (known as the Query Processor), from the
user's logical view to the physical file system
Controls DBMS information access that is stored on disk
It also controls handling buffers in main memory
It also enforces constraints to maintain consistency and integrity of the data
It also synchronizes the simultaneous operations performed by the concurrent users
It also controls the backup and recovery operations
4 Data Dictionary - Data Dictionary is a repository of descriptions of the data in the database It
contains information about
1 Data - names of the tables, names of attributes of each table, length of attributes, and
number of rows in each table
2 Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data
definitions are changed
3 Constraints on data, ie the range of values permitted
4 Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes
5 Access authorization - the description of database users, their responsibilities
and their access rights
6 Usage statistics, such as frequency of queries and transactions
The data dictionary is used to actually control the data integrity, database operation
and accuracy It may be used as an important part of the DBMS
Importance of Data Dictionary -
Data Dictionary is necessary in the databases due to the following reasons
• It improves the control of the DBA over the information system and the users'
understanding of the use of the system
• It helps in documenting the database design process by storing documentation of the
result of every design phase and design decisions
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve Users: Naive users need not be aware of the presence of the database system or any other system A user of an automatic teller machine falls under this category The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts Other such naive users are those for whom the type and range of response is always indicated Thus even a very competent database designer could be allowed to use a particular database system only as a naive user
ii) Online Users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program The more sophisticated of these users may also use a data manipulation language to manipulate the database directly Online users can also be naive users requiring help, such as menus
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator This person or group is referred to as the database administrator (DBA) They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels
The DBA is the custodian of the data and controls the database structure The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS
(D) What are the advantages of using a DBMS over the conventional
file processing system
Ans A database is a collection of non-redundant data which can be shared by different application
systems This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency It implies separation of physical storage from the use
of the data by an application program, ie program/data independence The user, programmer or
application specialist need not know the details of how the data are stored; such details are
transparent to the user Changes can be made to the data without affecting other components of
the system, eg changing the format of data items (real to integer arithmetic operations),
changing the file structure (reorganizing data internally or changing the mode of access), or
relocating data from one device to another, eg from optical to magnetic storage, or from tape to disk
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data This may lead to
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updation of the same data in different files
• Time wasted in entering data again and again
• Computer resources being needlessly used
• Difficulty in combining information
2 Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file This may lead to inconsistent data, so we need to remove this duplication of
data in multiple files to eliminate inconsistency
3 Better service to the users - A DBMS is often used to provide better services to the users In
a conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness is likely to improve, since the data can now be shared and the DBMS makes it
easy to respond to anticipated information requests
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise Also, use of a DBMS
should allow users that don't know programming to interact with the data more easily, unlike
a file processing system where the programmer may need to write new programs to meet every
new demand
4 Flexibility of the System is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system Application programs need not be changed when changing the
data in the database
5 Integrity can be improved - Since data of an organization using the database approach is
centralized and would be used by a number of users at a time, it is essential to enforce
integrity constraints
In conventional systems, because the data is duplicated in multiple files, updating or
changing data may sometimes lead to entry of incorrect data in some of the files where it exists
6 Standards can be enforced - Since all access to the database must be through DBMS so
standards are easier to enforce Standards may relate to the naming of data format of data
structure of the data etc Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems
7 Security can be improved - In conventional systems, applications are developed in an
ad-hoc/temporary manner Often different systems of an organization would access different
components of the operational data; in such an environment, enforcing security can be quite
difficult Setting up of a database makes it easier to enforce security restrictions since data is
now centralized It is easier to control who has access to what parts of the database Different
checks can be established for each type of access (retrieve modify delete etc) to each piece
of information in the database
8 Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work, and therefore its own
needs, as the most important. Once a database has been
set up with centralized control, it will be necessary to identify the organization's requirements and
to balance the needs of the competing units. It may become necessary to ignore some
requests for information if they conflict with higher-priority needs of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for a similar service using conventional systems,
since the productivity of programmers can be higher when using the non-procedural languages that
have been developed with DBMSs than when using procedural languages.
10 A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems it is more likely that files will be designed to meet the demands of particular
applications; the overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database makes it possible to provide schemes
for backup and recovery from failures, including disk crashes, power failures and software errors,
which may help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods involved are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans The E-R model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. Modelling is an
iterative, team-oriented process in which all business managers (or their designates)
should be involved, and the result should be validated with a "bottom-up" approach. The model has three primary components: entity,
relationship and attribute.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object, such as a house or a car; an event, such as a house sale or a car service; or a concept, such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type. There are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes An attribute is a characteristic of an entity. A Student entity, for example, has attributes such as
student ID, student name and address.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer; a supervises relationship between an employee and a department; a performs relationship
between an artist and a song; a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
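The entity above can be sketched in code. This is a minimal illustration, not any DBMS's API: composite attributes become nested types, the multivalued phone attribute becomes a list, and age is shown as a derived attribute computed from date_of_birth (the reference date and all data values are invented for the example).

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Street:                      # composite attribute nested inside Address
    street_name: str
    street_number: str
    apartment_number: str

@dataclass(frozen=True)
class Address:                     # composite attribute of Customer
    city: str
    state: str
    zip_code: str
    street: Street

@dataclass
class Customer:
    customer_id: int               # primary key attribute
    first_name: str
    last_name: str
    middle_name: str
    phone_numbers: list            # multivalued attribute
    date_of_birth: date
    address: Address

    @property
    def age(self) -> int:          # derived attribute: computed, not stored
        today = date(2020, 1, 1)   # fixed reference date for the example
        return today.year - self.date_of_birth.year

c = Customer(1, "Asha", "Rao", "K", ["555-0101"], date(1990, 5, 4),
             Address("Nagpur", "MH", "440001", Street("Main", "12", "3A")))
print(c.age)  # derived from date_of_birth: 30
```

The derived attribute is the key point: it never appears in storage, only in computation.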
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example
Ans In sequential files, index-sequential files and direct files, we have considered the retrieval and
update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file on the attribute "stud_name", we get the set of
records which satisfy the given value.
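The construction can be sketched as an inverted index: a mapping from each secondary-key value to the list of primary keys of the matching records. The record layout and field names below are invented for illustration.

```python
from collections import defaultdict

# Primary organization: records keyed by the primary key stud_id.
records = {
    101: {"stud_name": "Amit", "branch": "MCA"},
    102: {"stud_name": "Neha", "branch": "MCA"},
    103: {"stud_name": "Amit", "branch": "MBA"},
}

# Build the secondary index on the non-unique attribute "stud_name":
# each key value maps to a list of primary keys.
by_name = defaultdict(list)
for stud_id, rec in records.items():
    by_name[rec["stud_name"]].append(stud_id)

# Secondary key retrieval may yield several records for one key value.
hits = [records[i] for i in by_name["Amit"]]
print(len(hits))  # 2 records satisfy stud_name = "Amit"
```

Note the contrast with primary-key retrieval: here one key value fetches a set of records, not a single record.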
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE3- EITHER
(A) Let R(A, B, C) be a schema and let r1 and r2 both be relations on schema R. Give an equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is a join dependency? Discuss 5NF.
Ans Join Dependencies (JD)
A join dependency can be described as follows:
1 If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot
be further losslessly decomposed into any number of smaller tables.
Another way of expressing this is that every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependency: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise)
For any one, you must know the other two (cyclical)
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data
Buyer Vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell Jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item, you must know the buyer and vendor; to determine the vendor, you must know the buyer and
the item; and finally, to know the buyer, you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
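The decomposition above can be checked on the sample data: projecting the single table onto the three pairs and taking their natural join gives back exactly the original rows, with no spurious tuples. A minimal sketch using Python sets:

```python
# The sample Buying(buyer, vendor, item) rows from the table above.
rows = {("Sally", "Liz Claiborne", "Blouses"),
        ("Mary",  "Liz Claiborne", "Blouses"),
        ("Sally", "Jordach", "Jeans"),
        ("Mary",  "Jordach", "Jeans"),
        ("Sally", "Jordach", "Sneakers")}

bv = {(b, v) for b, v, _ in rows}          # Buyer-Vendor projection
bi = {(b, i) for b, _, i in rows}          # Buyer-Item projection
vi = {(v, i) for _, v, i in rows}          # Vendor-Item projection

# Natural join of the three projections.
rejoined = {(b, v, i) for (b, v) in bv for (v2, i) in vi
            if v == v2 and (b, i) in bi}
assert rejoined == rows   # the join dependency holds: no spurious tuples
```

Recording "Liz Claiborne sells Jeans" now takes a single row in the Vendor-Item table, rather than one row per buyer in the original single table.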
(B) Explain the architecture of an IMS System
Ans Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
[Figure: IMS system architecture - application programs A and B, each written in a host language plus DL/I, access the data through their program specification blocks (PSB-A, PSB-B), each consisting of PCBs; the IMS control program maps these, via the DBDs, onto the stored databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the
associated conceptual/internal mapping definition.
DBD (Database Description) Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All DBD names in IMS are limited to a maximum length of eight characters.
Example
1 DBD NAME=EDUCPDBD
2 SEGM NAME=COURSE,BYTES=256
3 FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4 FIELD NAME=TITLE,BYTES=33,START=4
5 FIELD NAME=DESCRIPN,BYTES=220,START=37
6 SEGM NAME=PREREQ,PARENT=COURSE,BYTES=36
7 FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8 FIELD NAME=TITLE,BYTES=33,START=4
9 SEGM NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
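The segment hierarchy this DBD defines (COURSE as root; PREREQ and OFFERING as its children; TEACHER and STUDENT as children of OFFERING) can be sketched with nested structures. The data values below are invented for illustration, and this is only a conceptual model, not how IMS stores segments.

```python
# One COURSE occurrence with its dependent segment occurrences.
course = {
    "COURSE#": "M23", "TITLE": "Dynamics",
    "PREREQ":   [{"COURSE#": "M16", "TITLE": "Trigonometry"}],
    "OFFERING": [{
        "DATE": "730813", "LOCATION": "Oslo", "FORMAT": "F3",
        "TEACHER": [{"EMP#": "421633", "NAME": "Gill"}],
        "STUDENT": [{"EMP#": "183009", "NAME": "Anne", "GRADE": "A"}],
    }],
}

# Navigating from parent to child, roughly what a DL/I
# "get next within parent" sequence of calls would do:
students = [s["NAME"] for off in course["OFFERING"] for s in off["STUDENT"]]
print(students)
```

The point of the hierarchy is that a child segment (e.g. a STUDENT) is reachable only through its parent chain COURSE → OFFERING → STUDENT.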
External View
The user does not operate directly at the physical database level but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block) Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block) The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: the value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: a possible key. Each non-key field is functionally dependent on every candidate key,
and no attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of the functional dependencies used in normalization:
they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of
the dependency; they hold for all time; they are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce that set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation
and has the property that every functional dependency in Y is implied by the
functional dependencies in X.
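Whether one FD set implies another can be tested mechanically with the standard attribute-closure algorithm: X → Y is implied by a set of FDs exactly when Y is contained in the closure of X. A minimal sketch, with an invented relation R(A, B, C, D):

```python
def closure(attrs, fds):
    """Compute the closure of the attribute set `attrs` under `fds`,
    where fds is a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is in the closure, add the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Illustrative FDs on R(A, B, C, D): A -> B and B -> C.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
assert closure({"A"}, fds) == {"A", "B", "C"}   # so A -> C is implied
assert "D" not in closure({"A"}, fds)           # A -> D is not implied
```

The same routine is the workhorse for finding candidate keys: X is a superkey exactly when the closure of X is the full attribute set.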
(D) Explain 4 NF with examples
Ans Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal-form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF;
here we will pay particular attention to forms up to 3NF.
Database designers need not normalize to the highest possible normal form.
The database designers need not normalize to the highest possible normal form
Normalization is a formal technique for analyzing a relation based on its primary key and the functional
dependencies between its attributes.
It is often executed as a series of steps; each step corresponds to a specific normal form, which has
known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
BCNF: R is in BCNF iff every determinant is a candidate key
Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF thus
removes an unwanted kind of data structure: multivalued dependencies.
For a relation to be in fourth normal form, either:
there is no multivalued dependency in the relation, or
there are multivalued dependencies, but the dependent attributes depend on each other.
Either of these conditions must hold true.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers
multivalued dependencies.
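A multivalued dependency X →→ Y holds on an instance exactly when the relation equals the join of its projections onto (X, Y) and (X, rest), i.e. when the binary decomposition is lossless. A sketch on an invented instance of R(course, teacher, book):

```python
# Every teacher of a course uses every book of that course,
# the classic situation where course ->> teacher holds.
rows = {("DB", "Rao", "Date"), ("DB", "Rao", "Korth"),
        ("DB", "Sen", "Date"), ("DB", "Sen", "Korth")}

xy = {(c, t) for c, t, _ in rows}      # projection on (course, teacher)
xz = {(c, b) for c, _, b in rows}      # projection on (course, book)

# course ->> teacher holds iff joining the two projections on course
# gives back exactly the original relation.
rejoined = {(c, t, b) for (c, t) in xy for (c2, b) in xz if c == c2}
assert rejoined == rows                # MVD holds; decompose to reach 4NF
```

When the MVD holds but course is not a superkey, the relation violates 4NF, and the two projections are exactly the 4NF decomposition.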
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the
set of all its versions. Also, object versions can be treated as objects in their own right. Some object
databases also provide systematic support for triggers and constraints, which are the basis of active
databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could get a user's account information and
efficiently provide extensive information such as transactions and account entries.
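The "pointer following" idea above can be sketched in plain objects: the account holds direct references to its transactions, so retrieval navigates references instead of joining tables on a foreign key. Class and attribute names are invented for illustration.

```python
class Transaction:
    def __init__(self, amount):
        self.amount = amount

class Account:
    def __init__(self, owner):
        self.owner = owner
        self.transactions = []    # direct object references, not foreign keys

acct = Account("R. Kumar")
acct.transactions.append(Transaction(500))
acct.transactions.append(Transaction(-120))

# Retrieval navigates references; a relational design would instead
# join an accounts table to a transactions table on account_id.
balance = sum(t.amount for t in acct.transactions)
print(balance)  # 380
```

An object database makes such references persistent, which is what lets it skip the join at query time.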
(c) How is database recovery done? Discuss its different types.
Ans SQL Server database recovery models give you backup-and-restore flexibility. The model used will
determine how much time and space your backups will take and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time, even to the best configured systems. This is why you have to
explore the options available in order to prepare for the worst.
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000.
It has a built-in feature known as the database recovery model that controls the following:
Both the speed and size of your transaction log backups
The degree to which you might be at risk of losing committed transactions in the event of
media failure
Models
There are three types of database recovery models available:
Full Recovery
Bulk Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. The SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
transaction committed.
The Log Marks feature allows you to place reference points in the transaction log that allow you to
recover to a log mark.
This model logs CREATE INDEX operations. Recovery from a transaction log backup that includes index
creations is faster because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
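The role the transaction log plays in all three models can be illustrated with a toy redo-only replay: after a crash, only the writes of committed transactions are reapplied. This is a deliberately simplified sketch of the idea, not SQL Server's actual recovery algorithm, and the log records are invented.

```python
# A write-ahead log as a list of records: (kind, txn, [key, value]).
log = [
    ("begin",  "T1"),
    ("write",  "T1", "x", 10),
    ("commit", "T1"),
    ("begin",  "T2"),
    ("write",  "T2", "y", 99),   # T2 never commits before the crash
]

# Pass 1: find which transactions committed.
committed = {rec[1] for rec in log if rec[0] == "commit"}

# Pass 2 (redo): replay only the writes of committed transactions.
db = {}
for rec in log:
    if rec[0] == "write" and rec[1] in committed:
        _, _, key, value = rec
        db[key] = value

print(db)  # the uncommitted write to y is discarded
```

The recovery-model choice then amounts to how much of such a log is retained: the full model keeps everything for point-in-time restore, while the simple model truncates committed portions and gives up that ability.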
(d) Describe deadlocks in a distributed system.
Ans
Issues with combining heterogeneous data sources often referred to as information silos under a
single query interface have existed for some time In the early 1980s computer scientists began
designing systems for interoperability of heterogeneous databases[4] The first data integration
system driven by structured metadata was designed at the University of Minnesota in 1991 for
the Integrated Public Use Microdata Series (IPUMS). IPUMS used a data warehousing approach,
which extracts, transforms and loads data from heterogeneous sources into a single
view schema so that data from different sources become compatible[5] By making thousands of
population databases interoperable IPUMS demonstrated the feasibility of large-scale data
integration The data warehouse approach offers a tightly coupled architecture because the data
are already physically reconciled in a single queryable repository so it usually takes little time to
resolve queries[6]
The data warehouse approach is less feasible for data sets that are frequently updated, requiring
the extract, transform, load (ETL) process to be continuously re-executed for synchronization.
Difficulties also arise in constructing data warehouses when one has only a query interface to
summary data sources and no access to the full data This problem frequently emerges when
integrating several commercial query services like travel or classified advertisement web
applications
As of 2009 the trend in data integration favored loosening the coupling between data sources
and providing a unified query interface to access real-time data over a mediated schema
(see Figure 2), which allows information to be retrieved directly from the original databases. This is
consistent with the SOA approach popular in that era This approach relies on mappings between
the mediated schema and the schema of original sources and transforming a query into
specialized queries to match the schema of the original databases Such mappings can be
specified in two ways as a mapping from entities in the mediated schema to entities in the
original sources (the Global As View (GAV) approach) or as a mapping from entities in the
original sources to the mediated schema (the Local As View (LAV) approach) The latter
approach requires more sophisticated inferences to resolve a query on the mediated schema but
makes it easier to add new data sources to a (stable) mediated schema
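The GAV idea can be sketched concretely: the mediated schema is defined as a view, i.e. a function computed over the sources, and a query over the mediated schema is answered by unfolding that mapping. Source shapes and field names below are invented.

```python
# Two heterogeneous sources with different shapes.
source_a = [{"fname": "Ada",  "city": "Pune"}]
source_b = [{"name": "Turing, Alan", "location": "Delhi"}]

def mediated_person():
    """GAV mapping: each mediated Person entity is defined
    as a view computed over the sources."""
    for r in source_a:
        yield {"name": r["fname"], "city": r["city"]}
    for r in source_b:
        last, first = r["name"].split(", ")     # normalize "Last, First"
        yield {"name": first, "city": r["location"]}

# A query over the mediated schema is answered by unfolding the mapping:
names = sorted(p["name"] for p in mediated_person())
print(names)
```

The LAV alternative would instead describe each source as a view over the mediated schema, which makes adding a source easy but makes answering a query a harder inference problem.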
As of 2010 some of the work in data integration research concerns the semantic
integration problem This problem addresses not the structuring of the architecture of the
integration but how to resolve semantic conflicts between heterogeneous data sources For
example if two companies merge their databases certain concepts and definitions in their
respective schemas like earnings inevitably have different meanings In one database it may
mean profits in dollars (a floating-point number) while in the other it might represent the
number of sales (an integer) A common strategy for the resolution of such problems involves the
use of ontologies which explicitly define schema terms and thus help to resolve semantic
conflicts This approach represents ontology-based data integration On the other hand the
problem of combining research results from different bioinformatics repositories requires bench-
marking of the similarities computed from different data sources on a single criterion such as
positive predictive value This enables the data sources to be directly comparable and can be
integrated even when the natures of experiments are distinct[7]
As of 2011 it was determined that current data modeling methods were imparting data isolation
into every data architecture in the form of islands of disparate data and information silos This
data isolation is an unintended artifact of the data modeling methodology that results in the
development of disparate data models Disparate data models when instantiated as databases
form disparate databases Enhanced data model methodologies have been developed to eliminate
the data isolation artifact and to promote the development of integrated data models[8] One
enhanced data modeling method recasts data models by augmenting them with
structural metadata in the form of standardized data entities As a result of recasting multiple data
models the set of recast data models will now share one or more commonality relationships that
relate the structural metadata now common to these data models Commonality relationships are
a peer-to-peer type of entity relationships that relate the standardized data entities of multiple
data models Multiple data models that contain the same standard data entity may participate in
the same commonality relationship When integrated data models are instantiated as databases
and are properly populated from a common set of master data then these databases are
integrated
Since 2011, data hub approaches have been of greater interest than fully structured (typically
relational) Enterprise Data Warehouses. Since 2013, data lake approaches have risen to the level
of data hubs (see the popularity of all three search terms on Google Trends[9]). These approaches
combine unstructured or varied data into one location but do not necessarily require an (often
complex) master relational schema to structure and define all data in the Hub
Q2
EITHER
(a) Explain E-R Model with suitable example
Ans The E-R model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise.
Modelling is an iterative, team-oriented process in which all business managers (or
their designates) should be involved, and the result should be validated with a "bottom-up" approach.
The model has three primary components: entity, relationship and attribute.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity An entity may be defined as a thing which is recognized as being capable of an
independent existence and which can be uniquely identified. An entity is an abstraction from the
complexities of some domain. When we speak of an entity, we normally speak of some aspect of
the real world which can be distinguished from other aspects of the real world. An entity may be
a physical object, such as a house or a car; an event, such as a house sale or a car service; or a
concept, such as a customer transaction or order. An entity-type is a category; an entity, strictly
speaking, is an instance of a given entity-type. There are usually many instances of an entity-type.
Because the term entity-type is somewhat cumbersome, most people tend to use the term
entity as a synonym for it.
Attributes An attribute is a characteristic of an entity. A Student entity, for example, has attributes
such as student ID, student name and address.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship A relationship captures how two or more entities are related to one another.
Relationships can be thought of as verbs linking two or more nouns. Examples: an owns
relationship between a company and a computer; a supervises relationship between an employee
and a department; a performs relationship between an artist and a song; a proved relationship
between a mathematician and a theorem. Relationships are represented as diamonds connected
by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many 1 ------- M
Many to one M------1
Many to many M------M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
(b) Given: Entity Customer with attributes customer_id (primary key), name
(first_name, last_name, middle_name), phone_number, date_of_birth,
address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
An entity relationship diagram displays the relationships of the entity sets stored in a database. In other
words, we can say that ER diagrams help you to explain the logical structure of databases. At
first glance, an ER diagram looks very similar to a flowchart; however, an ER diagram includes
many specialized symbols, and their meanings make this model unique.
[Figure: Sample ER diagram]
Facts about ER Diagram Model
o ER model allows you to draw Database Design
o It is an easy to use graphical tool for modeling data
o Widely used in Database Design
o It is a GUI representation of the logical structure of a Database
o It helps you to identify the entities which exist in a system and the relationships
between those entities
(b) Differentiate between the network and hierarchical data models in DBMS.
Ans Hierarchical model
1 One to many or one to one relationships
2 Based on parent child relationship
3 Retrieve algorithms are complex and asymmetric
4 Data Redundancy more
Network model
1 Many to many relationships
2 Many parents as well as many children
3 Retrieve algorithms are complex and symmetric
4 Data redundancy less than in the hierarchical model
Relational model
1 One to OneOne to many Many to many relationships
2 Based on relational data structures
3 Retrieve algorithms are simple and symmetric
4 Data Redundancy less
OR
(c) Draw an E-R diagram for a Library Management System.
Ans
(d) State the advantages and disadvantages of the following file organizations:
(i) Index-sequential file
Ans
Sequential File Organization
1 A sequential file is designed for efficient processing of records in sorted order on some
search key
o Records are chained together by pointers to permit fast retrieval in search key
order
o Pointer points to next record in order
o Records are stored physically in search key order (or as close to this as possible)
o This minimizes number of block accesses
o Figure 10.15 shows an example with bname as the search key
2 It is difficult to maintain physical sequential order as records are inserted and deleted
o Deletion can be managed with the pointer chains
o Insertion poses a problem if there is no space where the new record should go
o If there is space, use it; else put the new record in an overflow block
o Adjust pointers accordingly
o Figure 10.16 shows the previous example after an insertion
o Problem: we now have some records out of physical sequential order
o If very few records are in overflow blocks, this will work well
o If order is lost, reorganize the file
o Reorganizations are expensive and are done when the system load is low
3 If insertions rarely occur we could keep the file in physically sorted order and reorganize
when insertion occurs In this case the pointer fields are no longer required
The Sequential File
Fixed format used for records
Records are the same length
All fields the same (order and length)
Field names and lengths are attributes of the file
One field is the key field
Uniquely identifies the record
Records are stored in key sequence
The Sequential File
New records are placed in a log file or transaction file
Batch update is performed to merge the log file with the master file
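The batch update above is a merge of two key-ordered files. A minimal sketch, with invented records; `heapq.merge` streams both sequences in key order without loading all records into memory, which matches how a sequential-file batch merge works on tape or disk.

```python
import heapq

# Master file and transaction (log) file, both sorted by the key field.
master = [(101, "Amit"), (103, "Neha"), (107, "Ravi")]
log    = [(102, "Kiran"), (105, "Mona")]   # new records accumulated since last merge

# Merge the log into the master, preserving key sequence.
new_master = list(heapq.merge(master, log))
print([k for k, _ in new_master])  # [101, 102, 103, 105, 107]
```

A real merge pass would also apply deletions and replacements recorded in the log, but the key-ordered single pass over both files is the essential idea.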
(ii) Direct file
Direct Access File System (DAFS) is a network file system similar to Network File System
(NFS) and Common Internet File System (CIFS) that allows applications to transfer data while
bypassing operating system control buffering and network protocol operations that can
bottleneck throughput DAFS uses the Virtual Interface (VI) architecture as its underlying
transport mechanism Using VI hardware an application transfers data to and from application
buffers without using the operating system which frees up the processor and operating system
for other processes and allows files to be accessed by servers using several different operating
systems DAFS is designed and optimized for clustered shared-file network environments that
are commonly used for Internet e-commerce and database applications DAFS is optimized for
high-bandwidth InfiniBand networks and it works with any interconnection that supports VI
including Fibre Channel and Ethernet
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and
promote DAFS Today more than 85 companies are part of the DAFS
Collaborative
Q3
EITHER
(a) Explain tuple relational calculus
Ans
Relational Calculus
A relational calculus query specifies what is to be retrieved rather than how to retrieve it;
there is no description of how to evaluate the query.
In first-order logic (or predicate calculus), a predicate is a truth-valued function
with arguments.
When we substitute values for the arguments, the function yields an expression,
called a proposition, which can be either true or false.
If a predicate contains a variable (e.g. 'x is a member of staff'), there must be a range for x.
When we substitute some values of this range for x, the proposition may be true; for
other values it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
Tuple Relational Calculus
Tuple relational calculus is interested in finding tuples for which a predicate is true. It is based on the use of tuple variables.
A tuple variable is a variable that 'ranges over' a named relation, i.e. a variable
whose only permitted values are tuples of the relation.
We specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true, we write:
{S | P(S)}
Tuple Relational Calculus - Example
To find the details of all staff earning more than $10,000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Can use two quantifiers to tell how many instances the predicate applies to:
Existential quantifier ∃ ('there exists')
Universal quantifier ∀ ('for all')
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called
free variables.
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧
(B.branchNo = S.branchNo) ∧ B.city = 'London')
This means 'There exists a Branch tuple with the same branchNo as the branchNo of the current
Staff tuple S that is located in London'.
The universal quantifier is used in statements about every instance, such as:
(∀B) (B.city ≠ 'Paris')
This means 'For all Branch tuples, the address is not in Paris'.
Can also use ¬(∃B) (B.city = 'Paris'), which means 'There are no branches with an
address in Paris'.
Formulae should be unambiguous and make sense.
A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2, where θ is a comparison operator
Si.a1 θ c, where c is a constant
Can recursively build up formulae from atoms:
An atom is a formula.
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction
F1 ∨ F2, and negation ¬F1.
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also
formulae.
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000:
{S.fName, S.lName | Staff(S) ∧
S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow:
{S | Staff(S) ∧ (∃P) (PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Expressions can generate an infinite set. For example:
{S | ¬Staff(S)}
To avoid this, add the restriction that all values in the result must be values in the domain
of the expression.
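Tuple relational calculus expressions read almost directly as set comprehensions. The following sketch mirrors the queries above in Python; the Staff relation and its rows are illustrative sample data, not from the paper.

```python
# Illustrative Staff relation as a list of dicts (sample data).
staff = [
    {"staffNo": "SL21", "fName": "John", "position": "Manager", "salary": 30000},
    {"staffNo": "SG37", "fName": "Ann", "position": "Assistant", "salary": 12000},
    {"staffNo": "SG14", "fName": "David", "position": "Supervisor", "salary": 9000},
]

# {S | Staff(S) ∧ S.salary > 10000} -- whole tuples satisfying the predicate
high_paid = [s for s in staff if s["salary"] > 10000]

# {S.salary | Staff(S) ∧ S.salary > 10000} -- project a single attribute
salaries = [s["salary"] for s in staff if s["salary"] > 10000]

print([s["staffNo"] for s in high_paid])  # ['SL21', 'SG37']
print(salaries)                           # [30000, 12000]
```

The comprehension states *what* tuples qualify, not *how* to scan for them, which is exactly the declarative character of the calculus.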
Data Manipulation in SQL
Select, Update, Delete, Insert statements
Basic data retrieval
Condition specification
Arithmetic and aggregate operators
SQL Join: multiple-table queries
Set manipulation:
Any, In, Contains, All, Not In, Not Contains, Exists, Union, Minus, Intersect
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement.
Data must be entered later using INSERT.
CREATE TABLE S ( SNO    CHAR(5),
                 SNAME  CHAR(20),
                 STATUS DECIMAL(3),
                 CITY   CHAR(15),
                 PRIMARY KEY (SNO) );
A table name and unique column names must be specified.
Columns which are defined as primary keys will never have two rows with the same key
value.
A primary key may consist of more than one column (values unique in combination);
this is called a composite key.
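As a runnable sketch of the statement above, the same table can be created in SQLite from Python (SQLite maps CHAR/DECIMAL onto its own type affinities), and the primary key can be seen rejecting a duplicate:

```python
import sqlite3

# In-memory database for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE S ( SNO    CHAR(5),
                     SNAME  CHAR(20),
                     STATUS DECIMAL(3),
                     CITY   CHAR(15),
                     PRIMARY KEY (SNO) )
""")
conn.execute("INSERT INTO S VALUES ('S1', 'Smith', 20, 'London')")

# The primary key guarantees no two rows share the same SNO.
try:
    conn.execute("INSERT INTO S VALUES ('S1', 'Jones', 10, 'Paris')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```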
(b) Explain Data Manipulation in SQL
Ans:
A data manipulation language (DML) is a computer programming language used for adding
(inserting), deleting, and modifying (updating) data in a database. A DML is often
a sublanguage of a broader database language such as SQL, with the DML comprising some of
the operators in the language.[1] Read-only selecting of data is sometimes distinguished as being
part of a separate data query language (DQL), but it is closely related and sometimes also
considered a component of a DML; some operators may perform both selecting (reading) and
writing.
A popular data manipulation language is that of Structured Query Language (SQL), which is
used to retrieve and manipulate data in a relational database.[2] Other forms of DML are those
used by IMS/DL/I, CODASYL databases such as IDMS, and others.
In SQL, the data manipulation language comprises the SQL-data change statements,[3] which
modify stored data but not the schema or database objects. Manipulation of persistent database
objects, e.g. tables or stored procedures, via the SQL schema statements,[3] rather than the data
stored within them, is considered to be part of a separate data definition language (DDL). In SQL
these two categories are similar in their detailed syntax, data types, expressions, etc., but distinct
in their overall function.[3]
The SQL-data change statements are a subset of the SQL-data statements; this also contains
the SELECT query statement,[3] which strictly speaking is part of the DQL, not the DML. In
common practice, though, this distinction is not made, and SELECT is widely considered to be
part of DML,[4] so the DML consists of all SQL-data statements, not only the SQL-data
change statements. The SELECT ... INTO form combines both selection and manipulation
and thus is strictly considered to be DML, because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a
statement, which is almost always a verb. In the case of SQL these verbs are:
SELECT ... FROM ... WHERE ... (strictly speaking DQL)
SELECT ... INTO ...
INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...
For example, the command to insert a row into table employees:
INSERT INTO employees (first_name, last_name, fname) VALUES ('John', 'Capita',
'xcapit00');
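The four DML verbs can be exercised in one round trip. This sketch uses SQLite from Python with the same illustrative employees table as above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT INTO ... VALUES
conn.execute("INSERT INTO employees VALUES ('John', 'Capita', 'xcapit00')")

# UPDATE ... SET ... WHERE
conn.execute("UPDATE employees SET last_name = 'Capital' WHERE fname = 'xcapit00'")

# SELECT ... FROM ... WHERE (strictly speaking, DQL)
row = conn.execute(
    "SELECT last_name FROM employees WHERE fname = 'xcapit00'").fetchone()
print(row[0])  # Capital

# DELETE FROM ... WHERE
conn.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
print(conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0])  # 0
```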
OR
(c) Explain the following integrity rules
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce
these rules automatically, but it is safer to make sure that the rules are
already applied in the design. There are two types of integrity mentioned in
integrity rules: entity and referential. Two additional rules that aren't
necessarily included in integrity rules but are pertinent to database designs
are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that
is unique; this ensures that each row is uniquely identified by the primary
key. One requirement for entity integrity is that a primary key cannot have a
null value. The purpose of this integrity is to have each row carry a unique
identity, so that foreign key values can properly reference primary key values.
Theta Join
In a theta join we apply a condition on the input relation(s), and then only those
selected rows are used in the cross product to be merged and included in the output.
In a normal cross product, all the rows of one relation are mapped/merged with all
the rows of the second relation, but here only the selected rows of a relation are
cross-produced with the second relation. It is denoted R ⋈θ S.
If R and S are two relations, then θ is the condition applied for the select
operation on one relation; only the selected rows are then cross-produced with all the
rows of the second relation. For example, given two relations FACULTY and
COURSE, we first apply the select operation on the FACULTY relation to select
certain specific rows; these rows then have a cross product with the
COURSE relation. Viewing both relations, their attributes, and the resulting cross
product after the select operation makes the difference between a cross product and
a theta join clear.
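The theta join described above can be sketched with list comprehensions. The FACULTY and COURSE tuples below are illustrative stand-ins (the paper's actual example tables are not reproduced), with θ taken to be equality on a department attribute:

```python
# Toy relations: (id, name, dept) for faculty, (id, title, dept) for courses.
faculty = [("F1", "Khan", "CS"), ("F2", "Riaz", "Math")]
course = [("C1", "DBMS", "CS"), ("C2", "Calculus", "Math"), ("C3", "OS", "CS")]

# Plain cross product: every FACULTY row paired with every COURSE row.
cross = [(f, c) for f in faculty for c in course]

# Theta join with the condition θ: faculty.dept = course.dept,
# so only matching pairs survive.
theta = [(f, c) for f in faculty for c in course if f[2] == c[2]]

print(len(cross))  # 6 pairs
print(len(theta))  # 3 matching pairs
```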
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having
the foreign key (in the associated table) reference a primary key value (in the primary, or
parent, table). Because of this, we need to ensure that data on both sides of the relationship
remain intact.
So referential integrity requires that whenever a foreign key value is used it must reference a
valid existing primary key in the parent table
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no
foreign key in any related table with the value of 15. We should only be able to delete a primary
key if there are no associated records; otherwise we would end up with an orphaned record.
Here the related table contains a foreign key value that doesn't exist in the primary key field of
the primary table (i.e. the "CompanyId" field). This has resulted in an "orphaned record".
So referential integrity will prevent users from:
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
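The three rules above can be seen in action. This sketch uses SQLite (which requires foreign keys to be switched on per connection); the company/product tables and the CompanyId value 15 mirror the illustrative example above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE company (CompanyId INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE product (
        ProductId INTEGER PRIMARY KEY,
        CompanyId INTEGER REFERENCES company(CompanyId)
    )
""")
conn.execute("INSERT INTO company VALUES (15, 'Acme')")
conn.execute("INSERT INTO product VALUES (1, 15)")

# Deleting the parent while a child still references it is refused,
# so no orphaned record can appear.
try:
    conn.execute("DELETE FROM company WHERE CompanyId = 15")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```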
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually
with no indication of an error. This could result in records being "lost" in the database, because
they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated
company)
Or worse yet it could result in customers not receiving products they paid for
Worse still it could affect life and death situations such as a hospital patient not receiving the
correct treatment or a disaster relief team not receiving the correct supplies or information
Data Integrity
Referential integrity is a subset of data integrity which is concerned with the accuracy and
consistency of all data (relationship or otherwise) Maintaining data integrity is a crucial part of
working with databases
(d) Explain the following domains in detail with examples
Ans: Definition: The domain of a database attribute is the set of all allowable values that
attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column.
In data management and database analysis a data domain refers to all the unique values which
a data element may contain The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values[1]
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type and
allowed to have one of two known code values: 'M' for male, 'F' for female, and NULL for
records where gender is unknown or not applicable (or arguably 'U' for unknown as a sentinel
value). The data domain for the gender column is {'M', 'F'}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value, excluding NULL. Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check
constraint or, in more complex cases, in a database trigger. For example, a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero.
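Both kinds of domain rule described above can be expressed with CHECK constraints. The schema in this sketch is illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# gender is limited to the domain {'M', 'F'}; salary must be positive.
conn.execute("""
    CREATE TABLE person (
        name   TEXT,
        gender TEXT    CHECK (gender IN ('M', 'F')),
        salary NUMERIC CHECK (salary > 0)
    )
""")
conn.execute("INSERT INTO person VALUES ('Ann', 'F', 1200)")

# A value outside the declared domain is rejected by the constraint.
try:
    conn.execute("INSERT INTO person VALUES ('Bob', 'X', 500)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```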
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
(ii) Degree and cardinality
The degree of a relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) to the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
Note that the last is written M:N, not M:M.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A
one-to-one relationship rarely exists in practice, but it can; if it does, you may consider
combining the two entities into one.
For example, an employee is allocated a company car which can only be driven by that
employee.
Therefore there is a one-to-one relationship between employee and company car.
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For
example, taking the employee and department entities shown on the previous page, an employee
works in one department, but a department has many employees.
Therefore there is a one-to-many relationship between department and employee.
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would prevent any such relationships, but the
definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely exist. Normally they occur
because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a
team of many employees.
Therefore there is a many-to-many relationship between employee and project.
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL), the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct, and self-contained language. The DBTG proposal is intended to meet
the requirements of many distinct programming languages, not just COBOL; the user in a
DBTG system is considered to be an ordinary application programmer, and the language
therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the
Data Definition Language or DDL), the DBTG proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of the conceptual scheme that
was itself defined using the Data Definition Language. It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view.
Architecture of the DBTG Model
The architecture of a DBTG system is illustrated in Figure.
The architecture of the DBTG model can be divided into three levels, like the
architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data items they
contain, and the sets into which they are grouped. (Here logical record types are referred
to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data item, and set are excluded.
In the DBTG model the users are application programmers writing in an ordinary
programming language, such as COBOL, that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data
item) defined in the subschema. The program may refer to these data item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization: the process of decomposing unsatisfactory (bad) relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties.
Normalization in industry pays particular attention to
normalization up to 3NF, BCNF, or 4NF.
We will pay particular attention up to 3NF.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key.
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table:
transform data from the information source (e.g. a form) into table format with columns
and rows.
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
If a table of data meets the definition of a relation, it is in first normal form:
Every relation has a unique name.
Every attribute value is atomic (single-valued).
Every row is unique.
Attributes in tables have unique names.
The order of the columns is irrelevant.
The order of the rows is irrelevant.
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove each repeating group by:
entering appropriate data into the empty columns of rows containing the repeating
data ('flattening' the table),
or by
placing the repeating data, along with a copy of the original key attribute(s), into a
separate relation.
Second Normal Form (2NF)
Based on the concept of full functional dependency:
If A and B are attributes of a relation, B is fully dependent on A if B is functionally
dependent on A but not on any proper subset of A.
2NF: a relation that is in 1NF and in which every non-primary-key attribute is fully
functionally dependent on the primary key.
Equivalently: 1NF and no partial functional dependencies.
Partial functional dependency: when one or more non-key attributes are functionally
dependent on part of the primary key.
Every non-key attribute must be defined by the entire key, not just by part of the key.
If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies exist on the primary key, remove them by placing them in a new
relation along with a copy of their determinant.
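The 1NF-to-2NF steps above can be sketched on a toy relation. Here the key is (orderNo, productNo) and productName depends only on productNo, a partial dependency, so it is moved to a new relation with a copy of its determinant; all names and rows are illustrative:

```python
# 1NF relation with key (orderNo, productNo); productName depends
# only on productNo -- a partial dependency.
order_lines = [
    ("O1", "P1", "Bolt", 10),
    ("O1", "P2", "Nut", 20),
    ("O2", "P1", "Bolt", 5),
]

# Remove the partial dependency productNo -> productName by splitting it
# into its own relation, keeping a copy of the determinant productNo.
products = {(p, name) for (_, p, name, _) in order_lines}
order_lines_2nf = [(o, p, qty) for (o, p, _, qty) in order_lines]

print(sorted(products))     # [('P1', 'Bolt'), ('P2', 'Nut')]
print(order_lines_2nf)      # [('O1', 'P1', 10), ('O1', 'P2', 20), ('O2', 'P1', 5)]
```

The duplicated product names disappear, and each fact is now recorded exactly once.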
Third Normal Form (3NF)
2NF and no transitive dependencies.
Transitive dependency: a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency:
If A, B, and C are attributes of a relation such that A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C).
3NF: a relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with a suitable example
Ans:
As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic.
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key.
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key.
5. BCNF: R is in BCNF iff every determinant is a candidate key.
6. Determinant: an attribute on which some other attribute is fully functionally
dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is said to be in
4NF if and only if it is in BCNF and all its multivalued dependencies are functional
dependencies. 4NF removes unwanted data structures: multivalued dependencies.
Either of these conditions must hold for a relation to be in fourth normal form:
There is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
considers multivalued dependencies.
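A multivalued dependency X →→ Y holds when, for tuples agreeing on X, the Y-values and the remaining attributes vary independently. The checker below tests this on the classic course/teacher/book example; the relation, attribute names, and data are illustrative:

```python
def holds_mvd(rel, x, y):
    """Check the MVD x ->> y on a relation given as a list of dicts:
    for every pair t1, t2 agreeing on x, the tuple combining t1's
    x+y values with t2's remaining values must also be in the relation."""
    z = [a for a in rel[0] if a not in x + y]       # the "rest" attributes
    tuples = {tuple(sorted(t.items())) for t in rel}
    for t1 in rel:
        for t2 in rel:
            if all(t1[a] == t2[a] for a in x):
                t3 = {**{a: t1[a] for a in x + y}, **{a: t2[a] for a in z}}
                if tuple(sorted(t3.items())) not in tuples:
                    return False
    return True

# Every teacher of DB is paired with every book of DB,
# so course ->> teacher (and course ->> book) hold.
r = [
    {"course": "DB", "teacher": "Ali", "book": "Elmasri"},
    {"course": "DB", "teacher": "Ali", "book": "Date"},
    {"course": "DB", "teacher": "Sara", "book": "Elmasri"},
    {"course": "DB", "teacher": "Sara", "book": "Date"},
]
print(holds_mvd(r, ["course"], ["teacher"]))  # True
```

Dropping any one of the four rows breaks the independence, and the checker returns False, which is exactly the redundancy pattern 4NF is designed to eliminate.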
(d) What are inference axioms? Explain their significance in Relational
Database Design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy
certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity of (2) and (4)
[From Maier]
2. Let R = (A, B, C, D, E, G, H, I, J), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
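Derivations like the one above can be mechanized with the attribute-closure algorithm: grow X+ until no FD adds anything, then X → Y is derivable iff Y ⊆ X+. The sketch below uses the FDs from the Maier example:

```python
def closure(attrs, fds):
    """Compute the closure of a set of attributes under a list of
    functional dependencies given as (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is in the closure, pull in the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]
print(sorted(closure("AB", fds)))  # ['A', 'B', 'E', 'G', 'H', 'I', 'J']
# G and H are both in AB+, so AB -> GH is derivable, matching the proof.
```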
Significance in Relational Database Design: A relational database is a database structure,
commonly used in GIS, in which data is stored in two-dimensional tables and multiple
relationships between data elements can be defined and established in an ad-hoc manner. A
Relational Database Management System (RDBMS) is a database system made up of files with
data elements in a two-dimensional array (rows and columns). This database management
system has the capability to recombine data elements to form different relations, resulting in
great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables:
• Tables are manipulated a set at a time, rather than a record at a time.
• SQL is used to manipulate relational databases. The relational model was proposed by Dr. Codd in 1970.
• It is the basis for the relational database management system (RDBMS).
• The relational model contains the following components:
• a collection of objects or relations
• a set of operations to act on the relations
Q6
EITHER
(a) What is deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be avoided in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order to prevent such instances. Once
a deadlock does occur, the DBMS must have a method for detecting the deadlock;
to resolve it, the DBMS must select a transaction to cancel and revert the entire
transaction until the resources required become available, allowing one transaction to
complete while the other has to be reprocessed at a later time.
Explain the meaning of the expression ACID transaction.
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it
should be atomic: it should either be complete or fully incomplete; there should not
be anything like semi-complete. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, the transactions
should be scheduled in such a fashion that they remain in isolation from one another.
Durability means that once a transaction commits, its effects will persist even if there are
system failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database is to operate while transactions are in the
process of being changed. Their purpose is to ensure consistency throughout the database. For
example, if I am changing a row which affects the calculations or outputs of several other rows,
then all rows that are affected, or possibly affected, by a change in the row I'm working on will
be locked from changes until my change is complete. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency.
The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let's
say I'm processing a change to the tax rate in my state; my store clerk shouldn't be able
to read the total cost of a blue shirt, because the total cost row is affected by any changes in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but hasn't been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
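The resource-access-order prevention scheme described in (a) can be sketched with two threads. Both sort their locks into one fixed global order before acquiring, so a circular wait can never form; the locks and the ordering table are illustrative:

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
order = {id(lock_a): 0, id(lock_b): 1}  # fixed global lock order

def locked_in_order(*locks):
    # Every transaction acquires locks in the same global order.
    return sorted(locks, key=lambda l: order[id(l)])

def transfer(first, second, results):
    locks = locked_in_order(first, second)
    for l in locks:
        l.acquire()
    results.append("done")  # critical section stand-in
    for l in reversed(locks):
        l.release()

# The two threads request the locks in opposite order, which would
# deadlock without the sorting step above.
results = []
t1 = threading.Thread(target=transfer, args=(lock_a, lock_b, results))
t2 = threading.Thread(target=transfer, args=(lock_b, lock_a, results))
t1.start(); t2.start(); t1.join(); t2.join()
print(results)  # ['done', 'done']
```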
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation, and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
Binary locks: a lock on a data item can be in two states; it is either locked or
unlocked.
Shared/exclusive locks: this type of locking mechanism differentiates the locks based on
their uses. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock. Allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state. Read locks are shared because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
Strict Two-Phase Locking
The first phase of Strict-2PL is same as 2PL After acquiring all the locks in the first phase the
transaction continues to execute normally But in contrast to 2PL Strict-2PL does not release a
lock after using it Strict-2PL holds all the locks until the commit point and releases all the locks
at a time
Strict-2PL does not have cascading abort as 2PL does
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at 0002 clock time would be older than all other
transactions that come after it. For example, any transaction y entering the system at 0004 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read and write timestamp. This lets the system
know when the last 'read' and 'write' operation was performed on the data item.
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented, and maintained by a database administrator
and/or another information security professional.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls.
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload.
Physical security of the database server and backup equipment from theft and natural
disasters.
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them.
(d) Explain knowledge-based database systems in detail.
Ans:
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data Data was usually represented in a tabular format with strings or numbers in each
field
Multiple users A conventional database needed to support more than one user or system
logged into the same data at the same time
Transactions An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users These are the so-
called ACID properties Atomicity Consistency Isolation and Durability
Large long-lived data A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data Such a database usually needed to persist past
the specific uses of any individual program it needed to store data for years and decades
rather than for the life of a program
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes, subclasses, and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze and reuse
knowledge. For example, see the discussion of Corporate Memory in the earliest work on the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge base compared to a conventional
database. The knowledge base needed to know facts about the world: for example, to represent
the statement that "all humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store thousands of rows that
represented information about specific humans. Representing that all humans are mortal, and
being able to reason about any given human that they are mortal, is the work of a knowledge
base. Representing that George, Mary, Sam, Jenna, Mike, and hundreds of thousands of other
customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
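This contrast can be sketched in a few lines of Python (all names hypothetical): a knowledge base stores the general rule "all humans are mortal" once, while a database-style store would need a fact per specific person.

```python
# Database-style storage: one specific fact per individual.
humans = {"George", "Mary", "Sam", "Jenna", "Mike"}

def is_mortal(name):
    """Knowledge-base-style inference: apply the general rule
    human(x) -> mortal(x) instead of storing mortality per row."""
    return name in humans

assert is_mortal("Mary")        # derived from the rule, not stored
assert not is_mortal("Rex")     # not known to be human
```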
As expert systems moved from being prototypes to systems deployed in corporate environments,
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple, distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and object-oriented
communities, object-oriented databases such as Versant emerged. These were systems designed
from the ground up to have support for object-oriented capabilities, but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements, such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet. With the rise of the Internet,
documents, hypertext and multimedia support were now critical for any corporate database. It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents. This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge
management vendors, such as Lotus Notes. Knowledge management actually predated the
Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term knowledge-base to describe their repositories, but the
meaning had a subtle difference. In the case of previous knowledge-based systems, the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world. With knowledge management products, the knowledge was
primarily meant for humans, for example to serve as a repository of manuals, procedures,
policies, best practices, reusable designs and code, etc. In both cases the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018–2019
Subject: DBMS
MCA 1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data; there is no way they can interfere with one another. However, any
practical database has a mix of READ and WRITE operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-
user system. It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously.
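As an illustrative sketch (not actual DBMS internals), a lock shows the core idea: a conflicting read-modify-write sequence must be serialized, or concurrent writers can lose updates.

```python
import threading

balance = 0
lock = threading.Lock()

def deposit(amount, times):
    """Each deposit is a read-modify-write; the lock serializes the
    critical section so concurrent transactions cannot interleave."""
    global balance
    for _ in range(times):
        with lock:                  # concurrency control via mutual exclusion
            balance += amount       # now effectively atomic

threads = [threading.Thread(target=deposit, args=(1, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert balance == 40_000            # no lost updates
```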
(ii) Atomicity property
Ans: In database systems, atomicity (from Ancient Greek átomos, "undividable") is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a
consistent state, that is, money is neither lost nor created if either of those two operations fails.
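The bank-transfer example can be demonstrated with Python's sqlite3 module, whose connection context manager commits the transaction on success and rolls it back if an exception is raised, so both UPDATEs happen or neither does (account names and amounts are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INT)")
con.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
con.commit()

def transfer(amount):
    try:
        with con:  # one transaction: COMMIT on success, ROLLBACK on error
            con.execute("UPDATE account SET balance = balance - ? "
                        "WHERE name = 'A'", (amount,))
            (bal,) = con.execute("SELECT balance FROM account "
                                 "WHERE name = 'A'").fetchone()
            if bal < 0:
                raise ValueError("insufficient funds")  # forces rollback
            con.execute("UPDATE account SET balance = balance + ? "
                        "WHERE name = 'B'", (amount,))
    except ValueError:
        pass  # the partial withdrawal was undone

transfer(200)   # would overdraw A: nothing persists
assert dict(con.execute("SELECT name, balance FROM account")) == {"A": 100, "B": 50}
transfer(30)    # succeeds atomically
assert dict(con.execute("SELECT name, balance FROM account")) == {"A": 70, "B": 80}
```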
(B) Give the three level architecture proposal for DBMS.
Ans: Objectives of the three level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level one that is closest to the user It is also called the user view The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
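A small sketch using SQLite through Python illustrates the DDL/DML split (DCL statements such as GRANT/REVOKE belong to multi-user DBMSs and are not available in SQLite, so they are omitted here):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# DDL: define and declare the database object
con.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on the object
con.execute("INSERT INTO student (id, name) VALUES (1, 'Asha')")
con.execute("UPDATE student SET name = 'Asha K' WHERE id = 1")

rows = con.execute("SELECT id, name FROM student").fetchall()
assert rows == [(1, "Asha K")]
```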
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three level architecture proposal for DBMS are suitably explained
above.
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
[Fig.: Structure of Database Management System]
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2 DML Compiler and Query Optimizer - The DML commands such as insert, update, delete and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized by the query optimizer to find the best
way to execute the query, and then sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
Converting operations in users' queries, coming from the application programs or from the
combination of the DML compiler and query optimizer (together known as the Query Processor),
from the user's logical view to the physical file system.
Controlling access to the DBMS information that is stored on disk.
Handling buffers in main memory.
Enforcing constraints to maintain the consistency and integrity of the data.
Synchronizing the simultaneous operations performed by concurrent users.
Controlling the backup and recovery operations.
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
5. Access authorization - the description of database users, their responsibilities
and their access rights.
6. Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation,
and accuracy. It may be used as an important part of the DBMS.
Importance of Data Dictionary -
The data dictionary is necessary in databases due to the following reasons:
It improves the control of the DBA over the information system and the users'
understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the result of every design phase and the design decisions.
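In practice the data dictionary (catalog) is itself queryable. As a sketch, SQLite keeps its catalog in the sqlite_master table, which records the name and definition of every table and index:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")
con.execute("CREATE INDEX idx_title ON course(title)")

# Query the catalog the same way as any user table.
catalog = con.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name").fetchall()

assert ("table", "course") in catalog     # table recorded in the dictionary
assert ("index", "idx_title") in catalog  # so is its index
```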
5 Data Files - These contain the data portion of the database.
6 Compiled DML - The DML compiler converts the high level queries into low level file access
commands known as compiled DML.
7 End Users - The users of the database system can be classified in the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve Users: Naïve users need not be aware of the presence of the database system or any other supporting system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naïve users are those for whom the type and range of response is always indicated. Thus a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online users There are users who may communicate with the database directly via an online terminal or indirectly via a user interface and application program These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program The more sophisticated of these users may also use a data manipulation language to manipulate the database directly On-line users can also be naive users requiring help such as menus
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general purpose programming language such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator Centralized control of the database is exerted by a person or group of persons under the supervision of a high level administrator This person or group is referred to as the database administrator (DBA) They are users who are the most familiar with the database and are responsible for creating modifying and maintaining its three levels
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of physical storage from use of the
data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored; such details are
transparent to the user. Changes can be made to data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one
device to another, e.g. from optical to magnetic storage, or from tape to disk.
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updating the same data in different files
• Time wasted in entering data again and again
• Needless use of computer resources
• Difficulty in combining information
2 Elimination of Inconsistency - In the file processing system information is duplicated
throughout the system So changes made in one file may be necessary be carried over to
another file This may lead to inconsistent data So we need to remove this duplication of
data in multiple file to eliminate inconsistency
3. Better service to the users - A DBMS is often used to provide better services to the users. In
a conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness is likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to anticipated information requests.
Centralizing the data in the database also means that users can obtain new and combined
information easily that would have been impossible to obtain otherwise. Also, use of a DBMS
should allow users that don't know programming to interact with the data more easily, unlike a
file processing system where the programmer may need to write new programs to meet every
new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the
data in the database changes.
5 Integrity can be improved - Since data of the organization using database approach is
centralized and would be used by a number of users at a time It is essential to enforce
integrity-constraints
In the conventional systems because the data is duplicated in multiple files so updating or
changes may sometimes lead to entry of incorrect data in some files where it exists
6 Standards can be enforced - Since all access to the database must be through DBMS so
standards are easier to enforce Standards may relate to the naming of data format of data
structure of the data etc Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems
7. Security can be improved - In conventional systems, applications are developed in an
ad hoc/temporary manner. Often different systems of an organization would access different
components of the operational data; in such an environment enforcing security can be quite
difficult. Setting up of a database makes it easier to enforce security restrictions, since data is
now centralized It is easier to control who has access to what parts of the database Different
checks can be established for each type of access (retrieve modify delete etc) to each piece
of information in the database
8 Organizations requirement can be identified - All organizations have sections and
departments and each of these units often consider the work of their unit as the most
important and therefore consider their need as the most important Once a database has been
set up with centralized control, it will be necessary to identify the organization's requirements and
to balance the needs of the competing units. So it may become necessary to ignore some
requests for information if they conflict with a higher priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher when using the non-procedural languages
that have been developed for DBMSs than when using procedural languages.
10. Data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for an organization be built. In
conventional systems, it is more likely that files will be designed as per the needs of particular
applications. The overall view is often not considered. Building an overall view of an
organization's data is usually cost effective in the long term.
11. Provides backup and recovery - Centralizing a database allows schemes such as
recovery and backup from failures, including disk crashes, power failures and software errors,
which may help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process, with all business managers (or designates)
involved, and should be validated with a "bottom-up" approach. It has three primary components: entity,
relationship and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type. There are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example:
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
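The Customer entity above, with its composite attributes, can be sketched as nested data classes (sample values are hypothetical); address and street become nested types, and customer_id plays the role of the primary key:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Street:
    street_name: str
    street_number: str
    apartment_number: str

@dataclass(frozen=True)
class Address:          # composite attribute of Customer
    city: str
    state: str
    zip_code: str
    street: Street      # composite attribute nested inside address

@dataclass(frozen=True)
class Customer:
    customer_id: int    # primary key
    first_name: str
    last_name: str
    middle_name: str
    phone_number: str
    date_of_birth: str
    address: Address

c = Customer(1, "Asha", "Rao", "", "555-0100", "1990-01-01",
             Address("Nagpur", "MH", "440001",
                     Street("MG Road", "12", "3A")))
assert c.address.street.street_name == "MG Road"
```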
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index sequential files and direct files, we have considered the retrieval and
update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
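The stud_name example can be sketched as follows (record contents hypothetical): a secondary index maps each non-unique key value to the list of matching primary keys, so one lookup returns several records.

```python
# Primary index: stud_id -> record
students = {
    1: {"stud_name": "Ravi",  "marks": 70},
    2: {"stud_name": "Ravi",  "marks": 82},
    3: {"stud_name": "Meena", "marks": 90},
}

# Build a secondary index on the non-unique attribute stud_name.
secondary = {}
for sid, rec in students.items():
    secondary.setdefault(rec["stud_name"], []).append(sid)

# Secondary key retrieval: one key value, multiple records.
matches = [students[sid] for sid in secondary.get("Ravi", [])]
assert len(matches) == 2
```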
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3- EITHER
(A) Let R(A, B, C) and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it
cannot be non-loss decomposed into any number of smaller tables.
Another way of expressing this is that every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence - if a relation cannot be non-loss decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer   vendor          item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers
The question is what do you do if Claiborne starts to sell Jeans How many records must you create to
record this fact
The problem is there are pairwise cyclical dependencies in the primary key That is in order to determine
the item you must know the buyer and vendor and to determine the vendor you must know the buyer and
the item and finally to know the buyer you must know the vendor and the item The solution is to break
this one table into three tables Buyer-Vendor Buyer-Item and Vendor-Item
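The decomposition can be checked in a few lines of Python: project Buying onto the three pairwise tables, then rejoin them. For the sample data above the natural join reproduces exactly the original rows (a lossless join), which is what the join dependency requires.

```python
buying = {("Sally", "Liz Claiborne", "Blouses"),
          ("Mary",  "Liz Claiborne", "Blouses"),
          ("Sally", "Jordach", "Jeans"),
          ("Mary",  "Jordach", "Jeans"),
          ("Sally", "Jordach", "Sneakers")}

# The three projections: Buyer-Vendor, Buyer-Item and Vendor-Item.
buyer_vendor = {(b, v) for b, v, _ in buying}
buyer_item   = {(b, i) for b, _, i in buying}
vendor_item  = {(v, i) for _, v, i in buying}

# Natural join of the three projections.
rejoined = {(b, v, i)
            for b, v in buyer_vendor
            for b2, i in buyer_item if b2 == b
            for v2, i2 in vendor_item if v2 == v and i2 == i}

assert rejoined == buying   # lossless: the join recreates the original table
```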
(B) Explain the architecture of an IMS System
Ans: Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
[Fig.: IMS system structure - application programs A and B, each written in a host language
plus DL/I, access the physical databases (defined by DBDs) through the PCBs of their
respective PSBs, under the IMS control program]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the
associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
DBD   NAME=EDUCPDBD
SEGM  NAME=COURSE,BYTES=256
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
FIELD NAME=DESCRIPN,BYTES=220,START=37
SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
FIELD NAME=LOCATION,BYTES=12,START=7
FIELD NAME=FORMAT,BYTES=2,START=19
SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example:
PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
SENSEG NAME=COURSE,PROCOPT=G
SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following:
(i) Functional dependency
Ans: Functional Dependency: The value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: A possible key.
Each non-key field is functionally dependent on every candidate key.
No attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of functional dependencies used in
normalization:
They have a 1:1 relationship between the attribute(s) on the left and right-hand side of
a dependency, hold for all time, and are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation
and has the property that every functional dependency in Y is implied by the
functional dependencies in X.
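Whether a dependency X → Y holds in a given relation instance can be checked mechanically: every pair of rows that agrees on X must also agree on Y. A small sketch (relation and attribute names hypothetical):

```python
def holds(rows, X, Y):
    """Return True iff the functional dependency X -> Y holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in X)   # determinant value
        val = tuple(row[a] for a in Y)   # dependent value
        if seen.setdefault(key, val) != val:
            return False                 # same determinant, different dependent
    return True

emp = [{"id": 1, "dept": "CS", "building": "B1"},
       {"id": 2, "dept": "CS", "building": "B1"},
       {"id": 3, "dept": "EE", "building": "B2"}]

assert holds(emp, ["id"], ["dept"])          # id determines dept
assert holds(emp, ["dept"], ["building"])    # dept -> building
assert not holds(emp, ["building"], ["id"])  # building does not determine id
```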
(D) Explain 4 NF with examples
Ans: Normalization: The process of decomposing unsatisfactory ("bad") relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that it meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF.
We will pay particular attention up to 3NF.
The database designers need not normalize to the highest possible normal form
Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes
Often executed as a series of steps Each step corresponds to a specific normal form which has
known properties
As normalization proceeds relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the
key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and
only if it is in BCNF and every non-trivial multi-valued dependency is implied by a candidate key. 4NF
thus removes an unwanted data structure: multi-valued dependencies.
Either there is no multivalued dependency in the relation, or
there are multivalued dependencies but the attributes are dependent between themselves.
One of these conditions must hold for the relation to be in fourth normal form.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers
multivalued dependencies.
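The classic illustration is a relation CTB(course, teacher, book) where course ->> teacher and course ->> book are independent multivalued facts. A sketch (not from the source; names invented) showing that the 4NF decomposition into CT and CB is lossless:

```python
# CTB violates 4NF: course ->> teacher and course ->> book hold independently,
# so every teacher/book combination must be stored. Decomposing into CT and CB
# is lossless: their natural join on course reconstructs CTB exactly.
from itertools import product

ctb = {
    ("DB", "Smith", "Date"), ("DB", "Smith", "Ullman"),
    ("DB", "Jones", "Date"), ("DB", "Jones", "Ullman"),
}
ct = {(c, t) for c, t, _ in ctb}   # projection on (course, teacher)
cb = {(c, b) for c, _, b in ctb}   # projection on (course, book)

# natural join of the two projections on course
rejoined = {(c, t, b) for (c, t), (c2, b) in product(ct, cb) if c == c2}
print(rejoined == ctb)  # True: the decomposition is lossless
```

Each projection stores two rows instead of the four combination rows, which is exactly the redundancy 4NF removes.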
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s,
but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found by a more declarative programming approach. It is in the area of object query languages, and the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could however be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery is more easily achieved if you are running at least SQL Server 2000,
which has a built-in feature known as the database recovery model that controls the following:
The speed and size of your transaction log backups.
The degree to which you might be at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery models available
Full Recovery Bulk Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. The SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction. The Log Marks feature allows you to place reference points in the transaction log
and recover to a log mark.
This model logs CREATE INDEX operations. Recovery from a transaction log backup that includes index
creations is done at a faster pace because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans
form disparate databases Enhanced data model methodologies have been developed to eliminate
the data isolation artifact and to promote the development of integrated data models[8] One
enhanced data modeling method recasts data models by augmenting them with
structural metadata in the form of standardized data entities As a result of recasting multiple data
models the set of recast data models will now share one or more commonality relationships that
relate the structural metadata now common to these data models Commonality relationships are
a peer-to-peer type of entity relationships that relate the standardized data entities of multiple
data models Multiple data models that contain the same standard data entity may participate in
the same commonality relationship When integrated data models are instantiated as databases
and are properly populated from a common set of master data then these databases are
integrated
Since 2011 data hub approaches have been of greater interest than fully structured (typically
relational) Enterprise Data Warehouses. Since 2013 data lake approaches have risen to the level
of Data Hubs (see all three search terms' popularity on Google Trends[9]). These approaches
combine unstructured or varied data into one location, but do not necessarily require an (often
complex) master relational schema to structure and define all data in the Hub.
Q2
EITHER
(a) Explain E-R Model with suitable example
Ans: The E-R model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise.
It is an iterative, team-oriented process in which all business managers (or their
designates) are involved, and it should be validated with a "bottom-up" approach. It has three primary
components: entities, relationships, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity An entity may be defined as a thing which is recognized as being capable of an
independent existence and which can be uniquely identified An entity is an abstraction from the
complexities of some domain When we speak of an entity we normally speak of some aspect of
the real world which can be distinguished from other aspects of the real world An entity may be
a physical object such as a house or a car an event such as a house sale or a car service or a
concept such as a customer transaction or order An entity-type is a category An entity strictly
speaking is an instance of a given entity-type There are usually many instances of an entity-
type Because the term entity-type is somewhat cumbersome most people tend to use the term
entity as a synonym for this term
Attributes: An attribute is a characteristic of an entity. A Student's (entity) attributes: student ID, student
name, address, etc.
Attributes are of various types
Simple/Single Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another.
Relationships can be thought of as verbs linking two or more nouns. Examples: an owns
relationship between a company and a computer, a supervises relationship between an employee
and a department, a performs relationship between an artist and a song, a proved relationship
between a mathematician and a theorem. Relationships are represented as diamonds connected
by lines to each of the entities in the relationship. Types of relationships are as follows:
One to many (1:M)
Many to one (M:1)
Many to many (M:N)
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
(b) Given entity Customer with attributes customer_id (primary key), name
(first_name, last_name, middle_name), phone_number, date_of_birth,
address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
Entity relationship diagram displays the relationships of entity set stored in a database In other
words we can say that ER diagrams help you to explain the logical structure of databases At
first look an ER diagram looks very similar to the flowchart However ER Diagram includes
many specialized symbols and its meanings make this model unique
Sample ER
Diagram
Facts about ER Diagram Model
o ER model allows you to draw Database Design
o It is an easy to use graphical tool for modeling data
o Widely used in Database Design
o It is a GUI representation of the logical structure of a Database
o It helps you to identify the entities which exist in a system and the relationships
between those entities
(b)Differentiate between Network and Hierarchical data model in DBMS
Ans Hierarchical model
1 One to many or one to one relationships
2 Based on parent child relationship
3 Retrieve algorithms are complex and asymmetric
4 Data Redundancy more
Network model
1 Many to many relationships
2 Many parents as well as many children
3 Retrieve algorithms are complex and symmetric
4 Data redundancy less than in the hierarchical model, since a record can have many parents
Relational model
1 One to OneOne to many Many to many relationships
2 Based on relational data structures
3 Retrieve algorithms are simple and symmetric
4 Data Redundancy less
OR
(c)Draw E-R diagram on Library Management System
Ans
(d) State advantages and disadvantages of the following file organizations
(i) Index-Sequential file
Ans
Sequential File Organization
1 A sequential file is designed for efficient processing of records in sorted order on some
search key
o Records are chained together by pointers to permit fast retrieval in search key
order
o Pointer points to next record in order
o Records are stored physically in search key order (or as close to this as possible)
o This minimizes number of block accesses
o Figure 10.15 shows an example with bname as the search key
2 It is difficult to maintain physical sequential order as records are inserted and deleted
o Deletion can be managed with the pointer chains
o Insertion poses problems if no space where new record should go
o If space use it else put new record in an overflow block
o Adjust pointers accordingly
o Figure 10.16 shows the previous example after an insertion
o Problem we now have some records out of physical sequential order
o If very few records in overflow blocks this will work well
o If order is lost reorganize the file
o Reorganizations are expensive and done when system load is low
3 If insertions rarely occur we could keep the file in physically sorted order and reorganize
when insertion occurs In this case the pointer fields are no longer required
The Sequential File
Fixed format used for records
Records are the same length
All fields the same (order and length)
Field names and lengths are attributes of the file
One field is the key field
It uniquely identifies the record
Records are stored in key sequence
The Sequential File
New records are placed in a log file or transaction file
Batch update is performed to merge the log file with the master file
(ii) Direct file
Direct Access File System (DAFS) is a network file system similar to Network File System
(NFS) and Common Internet File System (CIFS) that allows applications to transfer data while
bypassing operating system control buffering and network protocol operations that can
bottleneck throughput DAFS uses the Virtual Interface (VI) architecture as its underlying
transport mechanism Using VI hardware an application transfers data to and from application
buffers without using the operating system which frees up the processor and operating system
for other processes and allows files to be accessed by servers using several different operating
systems DAFS is designed and optimized for clustered shared-file network environments that
are commonly used for Internet e-commerce and database applications DAFS is optimized for
high-bandwidth InfiniBand networks and it works with any interconnection that supports VI
including Fibre Channel and Ethernet
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and
promote DAFS. Today more than 85 companies are part of the DAFS Collaborative.
Q3
EITHER
(a) Explain tuple relational calculus
Ans
Relational Calculus
Relational calculus query specifies what is to be retrieved rather than how to retrieve it
No description of how to evaluate a query
In first-order logic (or predicate calculus), a predicate is a truth-valued function
with arguments.
When we substitute values for the arguments, the function yields an expression,
called a proposition, which can be either true or false.
Relational Calculus
If a predicate contains a variable (e.g. 'x is a member of staff'), there must be a range for x.
When we substitute some values of this range for x, the proposition may be true; for
other values it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
Tuple Relational Calculus
Interested in finding tuples for which a predicate is true. Based on use of tuple variables.
A tuple variable is a variable that 'ranges over' a named relation, i.e. a variable
whose only permitted values are tuples of the relation.
Specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Tuple Relational Calculus - Example
To find details of all staff earning more than $10,000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Tuple Relational Calculus
Can use two quantifiers to tell how many instances the predicate applies to:
Existential quantifier ∃ ('there exists')
Universal quantifier ∀ ('for all')
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called
free variables
Tuple Relational Calculus
Existential quantifier used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧
(B.branchNo = S.branchNo) ∧ B.city = 'London')
Means 'There exists a Branch tuple with the same branchNo as the branchNo of the current
Staff tuple S, and is located in London'
Tuple Relational Calculus
Universal quantifier is used in statements about every instance, such as:
(∀B) (B.city ≠ 'Paris')
Means 'For all Branch tuples, the address is not in Paris'
Can also use ¬(∃B) (B.city = 'Paris'), which means 'There are no branches with an
address in Paris'
Tuple Relational Calculus
Formulae should be unambiguous and make sense
A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
Can recursively build up formulae from atoms:
An atom is a formula
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction
F1 ∨ F2, and negation ¬F1
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also
formulae
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000:
{S.fName, S.lName | Staff(S) ∧
S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow:
{S | Staff(S) ∧ (∃P) (PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Tuple Relational Calculus
Expressions can generate an infinite set. For example:
{S | ¬Staff(S)}
To avoid this add restriction that all values in result must be values in the domain
of the expression
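The calculus query {S | Staff(S) ∧ S.salary > 10000} maps directly to a declarative SQL query. A sketch using Python's built-in sqlite3 (the table contents are invented for illustration):

```python
# The tuple-relational-calculus query {S | Staff(S) ∧ S.salary > 10000}
# expressed as SQL: the WHERE clause plays the role of the predicate P(S).
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, salary INTEGER)")
con.executemany("INSERT INTO Staff VALUES (?, ?)",
                [("S1", 9000), ("S2", 12000), ("S3", 30000)])

rows = sorted(con.execute("SELECT * FROM Staff WHERE salary > 10000").fetchall())
print(rows)  # [('S2', 12000), ('S3', 30000)]
```

The projected form {S.salary | …} corresponds to `SELECT salary FROM Staff WHERE salary > 10000`.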
Data Manipulations in SQL
Select Update Delete Insert Statement
Basic Data retrieval
Condition Specification
Arithmetic and Aggregate operators
SQL Join Multiple Table Queries
Set Manipulation
Any In Contains All Not In Not Contains Exists Union Minus Intersect
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement
Data must be entered later using INSERT
CREATE TABLE S ( SNO CHAR(5),
SNAME CHAR(20),
STATUS DECIMAL(3),
CITY CHAR(15),
PRIMARY KEY (SNO) )
Creating Tables
A table name and unique column names must be specified
Columns which are defined as primary keys will never have two rows with the same key
value
Primary key may consist of more than one column (values unique in combination)
called composite key
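A composite key can be exercised directly; a sketch with Python's sqlite3 (the SP supplier-part table is invented, in the style of the S table above):

```python
# Composite primary key: the pair (SNO, PNO) must be unique in combination,
# though each column alone may repeat.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE SP (
    SNO CHAR(5),
    PNO CHAR(6),
    QTY DECIMAL(9),
    PRIMARY KEY (SNO, PNO))""")
con.execute("INSERT INTO SP VALUES ('S1', 'P1', 300)")
con.execute("INSERT INTO SP VALUES ('S1', 'P2', 200)")  # same SNO, new pair: OK
try:
    con.execute("INSERT INTO SP VALUES ('S1', 'P1', 999)")  # duplicate pair
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```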
(b) Explain Data Manipulation in SQL
Ans
A data manipulation language (DML) is a computer programming language used for adding
(inserting) deleting and modifying (updating) data in a database A DML is often
a sublanguage of a broader database language such as SQL with the DML comprising some of
the operators in the language[1] Read-only selecting of data is sometimes distinguished as being
part of a separate data query language (DQL) but it is closely related and sometimes also
considered a component of a DML some operators may perform both selecting (reading) and
writing
A popular data manipulation language is that of Structured Query Language (SQL) which is
used to retrieve and manipulate data in a relational database[2] Other forms of DML are those
used by IMSDLI CODASYL databases such as IDMS and others
In SQL the data manipulation language comprises the SQL-data change statements[3] which
modify stored data but not the schema or database objects Manipulation of persistent database
objects eg tables or stored procedures via the SQL schema statements[3] rather than the data
stored within them is considered to be part of a separate data definition language (DDL) In SQL
these two categories are similar in their detailed syntax data types expressions etc but distinct
in their overall function[3]
The SQL-data change statements are a subset of the SQL-data statements this also contains
the SELECT query statement[3] which strictly speaking is part of the DQL not the DML In
common practice though this distinction is not made and SELECT is widely considered to be
part of DML[4] so the DML consists of all SQL-datastatements not only the SQL-data
change statements The SELECT INTO form combines both selection and manipulation
and thus is strictly considered to be DML because it manipulates (ie modifies) data
Data manipulation languages have their functional capability organized by the initial word in a
statement which is almost always a verb In the case of SQL these verbs are
SELECT FROM WHERE (strictly speaking DQL)
SELECT INTO
INSERT INTO VALUES
UPDATE SET WHERE
DELETE FROM WHERE
For example the command to insert a row into table employees
INSERT INTO employees (first_name, last_name, fname) VALUES ('John', 'Capita',
'xcapit00')
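The DML verbs listed above can be run end to end; a sketch using Python's sqlite3 against an invented employees table (the column names mirror the example above):

```python
# The four DML verbs in sequence: INSERT, UPDATE, SELECT (strictly DQL), DELETE.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")
con.execute("INSERT INTO employees (first_name, last_name, fname) "
            "VALUES ('John', 'Capita', 'xcapit00')")              # INSERT
con.execute("UPDATE employees SET last_name = 'Smith' "
            "WHERE fname = 'xcapit00'")                           # UPDATE
names = con.execute("SELECT first_name, last_name FROM employees "
                    "WHERE fname = 'xcapit00'").fetchall()        # SELECT
print(names)  # [('John', 'Smith')]
con.execute("DELETE FROM employees WHERE fname = 'xcapit00'")     # DELETE
print(con.execute("SELECT COUNT(*) FROM employees").fetchone()[0])  # 0
```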
OR
(c) Explain following integrity rules
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce
these rules automatically, but it is safer to make sure that the rules are
applied in the design. There are two types of integrity mentioned in
integrity rules: entity and referential. Two additional rules that aren't
necessarily included in integrity rules but are pertinent to database designs
are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that
is unique; this ensures that each row is uniquely identified by the primary
key. One requirement for entity integrity is that a primary key cannot have a
null value. The purpose of this integrity is to give each row a unique
identity, so that foreign key values can properly reference primary key values.
Theta Join
In a theta join we apply a condition on the input relation(s), and then only those selected
rows are used in the cross product to be merged and included in the output. It means
that in a normal cross product all the rows of one relation are mapped/merged with all
the rows of the second relation, but here only selected rows of a relation are made cross
product with the second relation. It is denoted as ⋈θ.
If R and S are two relations, then θ is the condition which is applied for the select
operation on one relation, and then only the selected rows are cross producted with all the
rows of the second relation. For example, take two relations FACULTY and
COURSE: we first apply the select operation on the FACULTY relation to
select certain specific rows, then these rows have a cross product with the
COURSE relation. This is the difference between cross product and theta join.
Seeing both relations with their different attributes, and then the
cross product after carrying out the select operation on the relation,
makes the difference between cross product and theta join clear.
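A theta join can be written out in plain Python as a cross product filtered by the condition θ; a sketch (the FACULTY and COURSE contents are invented):

```python
# Theta join as a filtered cross product: only the pairs satisfying θ
# (here, equality on the faculty id) appear in the output.
faculty = [("F1", "Ali"), ("F2", "Bano")]            # (fid, fname)
course = [("C1", "F1"), ("C2", "F2"), ("C3", "F1")]  # (cid, fid)

theta_join = [(fid, fname, cid)
              for (fid, fname) in faculty
              for (cid, cfid) in course
              if fid == cfid]  # θ: FACULTY.fid = COURSE.fid
print(theta_join)
# [('F1', 'Ali', 'C1'), ('F1', 'Ali', 'C3'), ('F2', 'Bano', 'C2')]
```

The plain cross product would have 2 × 3 = 6 rows; θ keeps only the 3 matching ones.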
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship
In relationships data is linked between two or more tables This is achieved by having
the foreign key (in the associated table) reference a primary key value (in the primary, or
parent, table). Because of this we need to ensure that data on both sides of the relationship
remain intact
So referential integrity requires that whenever a foreign key value is used it must reference a
valid existing primary key in the parent table
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no
foreign key in any related table with the value of 15. We should only be able to delete a primary
key if there are no associated records; otherwise we would end up with an orphaned record.
Here the related table contains a foreign key value that doesn't exist in the primary key field of
the primary table (i.e. the "CompanyId" field). This has resulted in an "orphaned record".
So referential integrity will prevent users from
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually
with no indication of an error. This could result in records being "lost" in the database because
they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated
company)
Or worse yet it could result in customers not receiving products they paid for
Worse still it could affect life and death situations such as a hospital patient not receiving the
correct treatment or a disaster relief team not receiving the correct supplies or information
Data Integrity
Referential integrity is a subset of data integrity which is concerned with the accuracy and
consistency of all data (relationship or otherwise) Maintaining data integrity is a crucial part of
working with databases
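The three prevented operations listed above can be observed directly; a sketch with Python's sqlite3 (tables invented; note SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`):

```python
# Referential integrity: inserting an orphaned foreign key value, or deleting
# a referenced parent row, is rejected once enforcement is switched on.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("CREATE TABLE company (CompanyId INTEGER PRIMARY KEY)")
con.execute("CREATE TABLE product (ProductId INTEGER PRIMARY KEY, "
            "CompanyId INTEGER REFERENCES company(CompanyId))")
con.execute("INSERT INTO company VALUES (15)")
con.execute("INSERT INTO product VALUES (1, 15)")
for stmt in ["INSERT INTO product VALUES (2, 99)",       # no company 99: orphan
             "DELETE FROM company WHERE CompanyId = 15"]:  # row 15 is referenced
    try:
        con.execute(stmt)
    except sqlite3.IntegrityError:
        print("rejected:", stmt)
```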
(d) Explain the following in detail with examples:
(i) Domain
Ans: Definition: The domain of a database attribute is the set of all allowable values that the
attribute may assume.
Examples
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column.
In data management and database analysis a data domain refers to all the unique values which
a data element may contain The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values[1]
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type and
allowed to have one of two known code values: M for male, F for female, and NULL for
records where gender is unknown or not applicable (or arguably U for unknown, as a sentinel
value). The data domain for the gender column is {M, F}.
In a normalized data model the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value, excluding NULL. Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules if database-enforced may be implemented through a check
constraint or in more complex cases in a database trigger For example a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
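A database-enforced domain boundary of the kind described can be sketched with a CHECK constraint, here using Python's sqlite3 (the person table is invented):

```python
# Domain enforcement via CHECK: gender only admits 'M' or 'F'. NULL still
# passes (representing "unknown"), since a CHECK whose result is NULL is
# treated as satisfied.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE person (name TEXT, gender TEXT "
            "CHECK (gender IN ('M', 'F')))")
con.execute("INSERT INTO person VALUES ('Ann', 'F')")
con.execute("INSERT INTO person VALUES ('Pat', NULL)")     # unknown: allowed
try:
    con.execute("INSERT INTO person VALUES ('Sam', 'X')")  # outside the domain
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

A positive-number domain would be written the same way, e.g. `CHECK (qty > 0)`.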
(ii) Degree and cardinality
The degree of relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) to the number of occurrences in another
There are three degrees of relationship known as
1 one-to-one (11)
2 one-to-many (1M)
3 many-to-many (MN)
Note that the last is written M:N, not M:M.
One-to-one (11)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-
to-one relationship rarely exists in practice, but it can; however, you may consider combining
the two entities into one.
For example an employee is allocated a company car which can only be driven by that
employee
Therefore there is a one-to-one relationship between employee and company car
One-to-Many (1M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example,
taking the employee and department entities shown on the previous page, an employee works in
one department but a department has many employees.
Therefore there is a one-to-many relationship between department and employee
Many-to-Many (MN)
This is where many occurrences in an entity relate to many occurrences in another entity
The normalisation process discussed earlier would prevent any such relationships but the
definition is included here for completeness
As with one-to-one relationships many-to-many relationships rarely exist Normally they occur
because an entity has been missed
For example an employee may work on several projects at the same time and a project has a
team of many employees
Therefore there is a many-to-many relationship between employee and project
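Relationally, the degrees above are implemented in standard ways: 1:M as a foreign key on the "many" side, and M:N as a separate junction (link) table. A sketch with Python's sqlite3 (the schema and data are invented):

```python
# 1:M (department-employee) via a foreign key; M:N (employee-project) via the
# works_on junction table holding pairs of foreign keys.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT,
                       dept_id INTEGER REFERENCES department(dept_id));
CREATE TABLE project (proj_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE works_on (emp_id INTEGER REFERENCES employee(emp_id),
                       proj_id INTEGER REFERENCES project(proj_id),
                       PRIMARY KEY (emp_id, proj_id));
INSERT INTO department VALUES (1, 'Sales');
INSERT INTO employee VALUES (10, 'Ann', 1), (11, 'Bob', 1);
INSERT INTO project VALUES (100, 'Audit'), (101, 'Launch');
INSERT INTO works_on VALUES (10, 100), (10, 101), (11, 100);
""")
pairs = con.execute("""SELECT e.name, p.title FROM employee e
                       JOIN works_on w ON w.emp_id = e.emp_id
                       JOIN project p ON p.proj_id = w.proj_id
                       ORDER BY e.name, p.title""").fetchall()
print(pairs)  # [('Ann', 'Audit'), ('Ann', 'Launch'), ('Bob', 'Audit')]
```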
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL), the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct and self-contained language. The DBTG is intended to meet the
requirements of many distinct programming languages, not just COBOL; the user in a
DBTG system is considered to be an ordinary application programmer, and the language
therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the
Data Definition Language or DDL), the DBTG has proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of the conceptual scheme that
was itself defined using the Data Definition Language. It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view.
Architecture of the DBTG Model
The architecture of a DBTG system is illustrated in Figure.
The architecture of the DBTG model can be divided into three different levels, as in the
architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data-items they
contain, and the sets into which they are grouped. (Here logical record types are referred
to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data-items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default all
other types of record, data-item and set are excluded.
In the DBTG model the users are application programmers writing in an ordinary
programming language such as COBOL that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data-item)
defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define normalization. Explain first and second normal form.
Ans: Normalization: The process of decomposing unsatisfactory "bad" relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties
Normalization in industry pays particular attention to
normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups
To create an unnormalized table
transform data from information source (eg form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value
If a table of data meets the definition of a relation it is in first normal form
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table
Identify repeating group(s) in unnormalized table which repeats for the key attribute(s)
Remove repeating group by
entering appropriate data into the empty columns of rows containing repeating
data ('flattening' the table)
Or by
placing repeating data along with copy of the original key attribute(s) into a
separate relation
Second Normal Form (2NF)
Based on concept of full functional dependency
A and B are attributes of a relation
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A
2NF - A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on the primary key
Second Normal Form (2NF)
1NF and no partial functional dependencies
Partial functional dependency when one or more non-key attributes are functionally
dependent on part of the primary key
Every non-key attribute must be defined by the entire key not just by part of the key
If a relation has a single attribute as its key then it is automatically in 2NF
1NF to 2NF
Identify primary key for the 1NF relation
Identify functional dependencies in the relation
If partial dependencies exist on the primary key remove them by placing them in a new
relation along with copy of their determinant
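The UNF/1NF-to-2NF procedure above can be sketched in code. A toy example with invented attribute names: in a relation keyed on (order_id, product_id), product_name depends only on product_id (a partial dependency), so it is moved into a new relation together with a copy of its determinant.

```python
# Hypothetical 1NF relation keyed on (order_id, product_id).
# product_name depends only on product_id -> partial dependency.
order_lines = [
    {"order_id": 1, "product_id": 10, "product_name": "Pen", "qty": 3},
    {"order_id": 1, "product_id": 20, "product_name": "Ink", "qty": 1},
    {"order_id": 2, "product_id": 10, "product_name": "Pen", "qty": 5},
]

# Remove the partial dependency: place product_name in a new relation
# along with a copy of its determinant, product_id.
products = {r["product_id"]: {"product_id": r["product_id"],
                              "product_name": r["product_name"]}
            for r in order_lines}

# The remaining relation keeps only attributes fully dependent on the key.
order_items = [{"order_id": r["order_id"], "product_id": r["product_id"],
                "qty": r["qty"]} for r in order_lines]
```

Both relations are now in 2NF: every non-key attribute depends on the whole key of its relation.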
Third Normal Form (3NF)
2NF and no transitive dependencies
Transitive dependency a functional dependency between two or more non-key attributes
Based on concept of transitive dependency
A, B, and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C)
3NF - A relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key
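The transitive-dependency removal that produces 3NF can be sketched the same way (all attribute names invented: staff_id → branch_id and branch_id → branch_addr, so branch_addr depends on the key only transitively, via branch_id).

```python
# Hypothetical 2NF relation: staff_id -> branch_id, branch_id -> branch_addr,
# so branch_addr is transitively dependent on the key staff_id via branch_id.
staff = [
    {"staff_id": 1, "branch_id": "B1", "branch_addr": "Main St"},
    {"staff_id": 2, "branch_id": "B1", "branch_addr": "Main St"},
    {"staff_id": 3, "branch_id": "B2", "branch_addr": "High St"},
]

# Remove the transitive dependency by placing branch_addr in a new
# relation keyed on its determinant, branch_id.
branches = {r["branch_id"]: r["branch_addr"] for r in staff}
staff_3nf = [{"staff_id": r["staff_id"], "branch_id": r["branch_id"]}
             for r in staff]
```

Note how the duplicated "Main St" address, a source of update anomalies, is now stored once.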
OR
(c) Explain multivalued dependency with suitable example
As normalization proceeds relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies
Ans
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is said to be in
4NF if and only if it is in BCNF and all its multivalued dependencies are functional
dependencies. 4NF thus removes the unwanted structures caused by multivalued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
also considers multivalued dependencies.
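A small sketch of a multivalued dependency and its 4NF decomposition (the relation and its values are invented for illustration): in Course(course, teacher, book), the teachers and books for a course vary independently, so course ↠ teacher and course ↠ book hold, and projecting each MVD into its own relation loses no information.

```python
# Hypothetical relation with MVDs course ->> teacher and course ->> book:
# because teachers and books are independent, every (teacher, book)
# combination for a course must appear, which bloats the relation.
course = {
    ("DB", "Smith", "Ullman"), ("DB", "Smith", "Date"),
    ("DB", "Jones", "Ullman"), ("DB", "Jones", "Date"),
}

# 4NF decomposition: project each multivalued dependency into its own relation.
course_teacher = {(c, t) for c, t, b in course}
course_book = {(c, b) for c, t, b in course}

# The decomposition is lossless: the natural join restores the original.
rejoined = {(c, t, b) for c, t in course_teacher
            for c2, b in course_book if c == c2}
```

Four tuples shrink to two relations of two tuples each, and adding a new book no longer requires one new row per teacher.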
(d) What are inference axioms Explain its significance in Relational
Database Design
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy
certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity of (2) and (4)
[From Maier]
1. Let R = (ABCDEGHI), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
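Derivations like the two above can be checked mechanically with the attribute-closure algorithm, which relies only on consequences of the inference axioms: X → Y is derivable from F iff Y ⊆ closure(X). A minimal sketch:

```python
# Attribute-closure algorithm: repeatedly add the right-hand side of any
# FD whose left-hand side is already contained in the closure.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Maier's example: F = {AB->E, AG->J, BE->I, E->G, GI->H}
F = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]
print(sorted(closure("AB", F)))  # prints ['A', 'B', 'E', 'G', 'H', 'I', 'J']
# G and H are both in the closure of AB, so AB -> GH holds, matching
# the twelve-step axiomatic derivation above.
```

The same function verifies the Ullman example by checking that City ∈ closure({Street, Zip}).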
Significance in Relational Database design: A relational database is a database structure, commonly used in GIS, in
which data is stored based on two-dimensional tables, where multiple relationships between data
elements can be defined and established in an ad-hoc manner. A Relational Database Management
System is a database system made up of files with data elements in a two-dimensional array (rows
and columns). This database management system has the capability to recombine data elements
to form different relations, resulting in great flexibility of data usage.
A database that is perceived by the user as a collection of two-dimensional tables:
• Tables are manipulated a set at a time, rather than a record at a time
• SQL is used to manipulate relational databases; proposed by Dr. Codd in 1970
• The basis for the relational database management system (RDBMS)
• The relational model contains the following components:
• Collection of objects or relations
• Set of operations to act on the relations
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be avoided in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens. One way to prevent or to avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing. Secondly, they can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order, to prevent such instances. Essentially,
once a deadlock does occur, the DBMS must have a method for detecting the deadlock;
then, to resolve it, the DBMS must select a transaction to cancel and revert the entire
transaction until the resources required become available, allowing one transaction to
complete while the other has to be reprocessed at a later time.
921 Explain the meaning of the expression ACID transaction
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it
should be atomic: it should either be complete or fully incomplete; there should not
be anything like semi-complete. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, then the transactions
should be scheduled in such a fashion that they remain in isolation from one another. Durability
means that once a transaction commits, its effects will persist even if there are system failures.
924 What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database is to operate while transactions are in the
process of being changed. Their purpose is to ensure consistency throughout the database. For example, if I
am changing a row which affects the calculations or outputs of several other rows, then
all rows that are affected, or possibly affected, by a change in the row I'm working on will
be locked from changes until I am complete with my change. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let's
say I'm processing a change to the tax rate in my state; my store clerk shouldn't be able
to read the total cost of a blue shirt, because the total cost row is affected by any changes in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but hasn't been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
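The resource-ordering prevention strategy mentioned in the deadlock answer above can be sketched in a few lines (the lock names and helper functions are invented for the example): every transaction acquires its locks in one agreed global order, so a cycle of mutually waiting transactions can never form.

```python
import threading

# A fixed set of named resources, each guarded by a lock.
locks = {name: threading.Lock() for name in ("accounts", "orders")}

def acquire_in_order(names):
    """Acquire the named locks in a fixed global order (here, sorted
    by name) and return the order in which they were taken."""
    acquired = []
    for name in sorted(names):          # the agreed-upon global order
        locks[name].acquire()
        acquired.append(name)
    return acquired

def release(names):
    # Release in reverse order of acquisition.
    for name in reversed(names):
        locks[name].release()

# Two transactions both needing {orders, accounts} will each lock
# "accounts" first, so neither can hold one lock while waiting for
# the other lock in the opposite order.
held = acquire_in_order({"orders", "accounts"})
release(held)
```

This is the "resources must be locked in a certain order" idea; the other strategy (requesting all locks at one time) corresponds to the pre-claiming protocol discussed below.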
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation, and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
Lock based protocols
Time stamp based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
Binary Locks – A lock on a data item can be in two states: it is either locked or
unlocked.
Shared/exclusive – This type of locking mechanism differentiates the locks based on
their uses. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock. Allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state. Read locks are shared because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
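The growing/shrinking discipline of 2PL can be sketched as a small guard object (a toy model with invented names, not a full lock manager): once a transaction releases any lock, it enters the shrinking phase and may not acquire new ones.

```python
# Two-phase locking sketch: after the first unlock (the start of the
# shrinking phase), any further lock request is a protocol violation.
class TwoPhaseTxn:
    def __init__(self):
        self.held = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock after first unlock")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True          # the growing phase is over
        self.held.discard(item)

t = TwoPhaseTxn()
t.lock("A"); t.lock("B")               # growing phase
t.unlock("A")                          # shrinking phase begins
try:
    t.lock("C")                        # not allowed under 2PL
except RuntimeError as e:
    violation = str(e)
```

Strict 2PL, described next, is this same discipline with all unlocks deferred to the commit point.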
Strict Two-Phase Locking
The first phase of Strict-2PL is same as 2PL After acquiring all the locks in the first phase the
transaction continues to execute normally But in contrast to 2PL Strict-2PL does not release a
lock after using it Strict-2PL holds all the locks until the commit point and releases all the locks
at a time
Strict-2PL does not have cascading abort as 2PL does
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based protocol This protocol
uses either system time or logical counter as a timestamp
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution whereas timestamp-based protocols start working as soon as a transaction is
created
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at 0002 clock time would be older than all other
transactions that come after it. For example, any transaction y entering the system at 0004 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read- and write-timestamp. This lets the system
know when the last 'read' and 'write' operation was performed on the data item.
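The write rule implied by the description above can be sketched as follows (a simplified basic timestamp-ordering check; rollback bookkeeping and Thomas's write rule are omitted, and all names are invented): a write by transaction T on item X is rejected if a younger transaction has already read or written X.

```python
# Basic timestamp-ordering write rule (sketch): transaction with
# timestamp ts may write item only if no younger transaction has
# already read or written it; otherwise it must be rolled back.
def try_write(ts, item, read_ts, write_ts):
    if ts < read_ts.get(item, 0) or ts < write_ts.get(item, 0):
        return False                     # conflict: roll the writer back
    write_ts[item] = ts                  # record the latest write-timestamp
    return True

read_ts, write_ts = {}, {}
ok1 = try_write(5, "X", read_ts, write_ts)   # first write succeeds
read_ts["X"] = 8                             # a younger transaction reads X
ok2 = try_write(6, "X", read_ts, write_ts)   # older write is now rejected
```

This shows why the protocol "starts working as soon as a transaction is created": the outcome depends only on the transactions' timestamps, not on lock acquisition order.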
OR
(b) Explain database security mechanisms (8)
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d)Explain knowledge based database system in detail
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data Data was usually represented in a tabular format with strings or numbers in each
field
Multiple users A conventional database needed to support more than one user or system
logged into the same data at the same time
Transactions An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-called ACID properties: Atomicity, Consistency, Isolation, and Durability.
Large long-lived data A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data Such a database usually needed to persist past
the specific uses of any individual program it needed to store data for years and decades
rather than for the life of a program
The first knowledge-based systems had data needs that were the opposite of these database
requirements An expert system requires structured data Not just tables with numbers and
strings but pointers to other objects that in turn have additional pointers The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes subclasses and instances
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database The knowledge-base needed to know facts about the world For example to represent
the statement that All humans are mortal A database typically could not represent this general
knowledge but instead would need to store information about thousands of tables that
represented information about specific humans Representing that all humans are mortal and
being able to reason about any given human that they are mortal is the work of a knowledge-
base Representing that George Mary Sam Jenna Mike and hundreds of thousands of other
customers are all humans with specific ages sex address etc is the work for a database[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, the large database vendors, such as Oracle, added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 – 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans: Concurrency control is the procedure in DBMS for managing simultaneous
operations without conflicting with one another. Concurrent access is quite easy if all
users are just reading data; there is no way they can interfere with one another. However, any practical database will have a mix of READ and WRITE operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts which mostly occur with a multi-
user system It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of respective databases
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all occur, or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails.
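The A-to-B transfer described above can be sketched with SQLite, whose connection context manager commits on success and rolls back on an exception (the account names and amounts are invented for the example):

```python
import sqlite3

# Atomic transfer sketch: both UPDATEs commit together, or the
# rollback leaves the balances exactly as they were.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    try:
        with conn:  # the with-block is one transaction
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src))
            balance = conn.execute(
                "SELECT balance FROM account WHERE id = ?",
                (src,)).fetchone()[0]
            if balance < 0:
                raise ValueError("insufficient funds")  # forces rollback
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst))
    except ValueError:
        pass  # rollback already happened; balances are unchanged

transfer(conn, "A", "B", 30)    # succeeds: A=70, B=80
transfer(conn, "A", "B", 500)   # fails and rolls back: still A=70, B=80
```

After the failed transfer, the first UPDATE inside the transaction is undone as well: no money is lost or created.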
(B) Give the three level architecture proposal for DBMS
Ans: Objectives of the three level architecture proposal for DBMS:
All users should be able to access same data
A users view is immune to changes made in other views
Users should not need to know physical database storage details
DBA should be able to change database storage structures without affecting the users views
Internal structure of database should be unaffected by changes to physical aspects of storage
DBA should be able to change conceptual structure of database without affecting all users
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
The above three levels are explained in detail below:
External Level
This is the highest level one that is closest to the user It is also called the user view The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database object while the data
manipulation language performs operations on these objects The data control language is used to
control the user's access to database objects.
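The three sublanguages can be shown in a short sketch (SQLite via Python; the table and user names are invented, and since SQLite has no user accounts, the DCL statement is shown only as illustrative text):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and declare the database object.
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object.
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
rows = conn.execute("SELECT name FROM student").fetchall()

# DCL (illustrative only; not supported by SQLite):
#   GRANT SELECT ON student TO clerk;
```

The split mirrors the text: DDL creates objects, DML manipulates their contents, DCL governs who may access them.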
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus, the objectives of the three level architecture proposal for DBMS are explained
above.
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2. DML Compiler and Query Optimizer - The DML commands, such as insert, update, delete, and
retrieve, from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized in the best way to execute a query by
the query optimizer and then sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
Converts operations in users' queries, coming from the application programs or the combination of
the DML compiler and query optimizer (known as the Query Processor), from the user's logical view
to the physical file system.
Controls DBMS information access that is stored on disk.
It also controls handling buffers in main memory.
It also enforces constraints to maintain consistency and integrity of the data.
It also synchronizes the simultaneous operations performed by concurrent users.
It also controls the backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data - names of the tables, names of attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structure,
access paths, and file and record sizes.
5. Access authorization - the description of database users, their responsibilities
and their access rights.
6. Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation,
and accuracy. It may be used as an important part of the DBMS.
Importance of Data Dictionary -
The data dictionary is necessary in databases due to the following reasons:
It improves the control of the DBA over the information system and users'
understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the result of every design phase and the design decisions.
5 Data Files - It contains the data portion of the database
6. Compiled DML - The DML compiler converts the high-level queries into low-level file access
commands known as compiled DML.
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve User: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of her or his own accounts. Other such naive users are those for whom the type and range of response is always indicated. Thus a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users There are users who may communicate with the database directly via an online terminal or indirectly via a user interface and application program These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program The more sophisticated of these users may also use a data manipulation language to manipulate the database directly On-line users can also be naive users requiring help such as menus
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal, or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator Centralized control of the database is exerted by a person or group of persons under the supervision of a high level administrator This person or group is referred to as the database administrator (DBA) They are users who are the most familiar with the database and are responsible for creating modifying and maintaining its three levels
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications and data sharing: the spatial database
becomes a common resource for an agency. It implies separation of physical storage from use of the
data by an application program, i.e. program/data independence: the user, programmer, or
application specialist need not know the details of how the data are stored; such details are
transparent to the user. Changes can be made to data without affecting other components of the
system, e.g. change of format of data items (real to integer arithmetic operations), change of file
structure (reorganize data internally or change mode of access), or relocation from one device to
another (e.g. from optical to magnetic storage, from tape to disk).
Advantages
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors may be generated due to duplication of the same data in different files
• Time in entering data again and again is wasted
• Computer resources are needlessly used
• It is very difficult to combine information
2 Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data, so we need to remove this duplication of
data in multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In
a conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
currency are likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, use of a DBMS
should allow users who don't know programming to interact with the data more easily, unlike
a file processing system where the programmer may need to write new programs to meet every
new demand.
4 Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed on changing the
data in the database.
5 Integrity can be improved - Since the data of the organization using the database approach is
centralized and would be used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to entry of incorrect data in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data, the
structure of the data, etc. Standardizing stored data formats is usually desirable for the purposes
of data interchange or migration between systems.
7 Security can be improved - In conventional systems, applications are developed in an
ad hoc/temporary manner. Often different systems of an organization would access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions since the data is
now centralized. It is easier to control who has access to what parts of the database. Different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8 Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers the work of their unit as the most
important and therefore considers their needs as the most important. Once a database has been
set up with centralized control, it will be necessary to identify the organization's requirements and
to balance the needs of the competing units. So it may become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed with DBMSs than using procedural languages.
10 Data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as per the needs of particular
applications. The overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes such as
recovery and backups from failures, including disk crashes, power failures and software errors,
which may help the database to recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process with all business managers (or designates)
involved, and should be validated with a "bottom-up" approach. It has three primary components: entity,
relationship and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type. There are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer; a supervises relationship between an employee and a department; a performs relationship
between an artist and a song; a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. Types of
relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number)
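As a brief sketch (not part of the original answer), the Customer entity above can be flattened into a single relational table: the composite attributes name and address expand into simple columns. SQLite is used here only as a convenient in-memory engine, and the sample row values are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite attributes (name, address, street) flattened into simple columns.
conn.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,
        first_name       TEXT,
        middle_name      TEXT,
        last_name        TEXT,
        phone_number     TEXT,
        date_of_birth    TEXT,
        city             TEXT,
        state            TEXT,
        zip_code         TEXT,
        street_name      TEXT,
        street_number    TEXT,
        apartment_number TEXT
    )
""")
conn.execute(
    "INSERT INTO customer (customer_id, first_name, last_name, city) "
    "VALUES (?, ?, ?, ?)",
    (1, "John", "Smith", "Nagpur"),
)
row = conn.execute(
    "SELECT first_name, city FROM customer WHERE customer_id = 1"
).fetchone()
```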
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example
Ans In sequential files, index-sequential files and direct files, we have considered the retrieval and
update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For e.g., if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
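The point in (ii) and (iii) can be shown concretely. In this small sketch (table and data assumed), a secondary index is built on the non-unique attribute stud_name, and a lookup on one name returns several records:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, stud_name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Amit"), (2, "Priya"), (3, "Amit")])
# Secondary key: an index on a non-unique attribute.
conn.execute("CREATE INDEX idx_stud_name ON student(stud_name)")
# One secondary-key value matches a SET of records, not a single one.
matches = conn.execute(
    "SELECT roll_no FROM student WHERE stud_name = ?", ("Amit",)
).fetchall()
```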
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3- EITHER
(A) Let R(ABC) and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 - r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows
1 If a table can be decomposed into three or more smaller tables it must be capable of being joined
again on common keys to form the original table
A table is in fifth normal form (5NF) or Projection-Join Normal Form (PJNF) if it is in 4NF and it cannot
have a lossless decomposition into any number of smaller tables
Another way of expressing this is: every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further then it is in 5NF
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell Jeans? How many records must you create to
record this fact?
The problem is there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor, and to determine the vendor you must know the buyer and
the item, and finally to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
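The decomposition above can be checked mechanically. This sketch (SQLite used only as a convenient engine) projects the sample Buying table into the three suggested tables, then joins them back; with this data the three-way join loses nothing and adds no spurious rows, which is exactly the join dependency 5NF is about:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE buying (buyer TEXT, vendor TEXT, item TEXT);
    INSERT INTO buying VALUES
        ('Sally', 'Liz Claiborne', 'Blouses'),
        ('Mary',  'Liz Claiborne', 'Blouses'),
        ('Sally', 'Jordach', 'Jeans'),
        ('Mary',  'Jordach', 'Jeans'),
        ('Sally', 'Jordach', 'Sneakers');
    -- the three projections suggested by the 5NF decomposition
    CREATE TABLE bv AS SELECT DISTINCT buyer, vendor FROM buying;
    CREATE TABLE bi AS SELECT DISTINCT buyer, item   FROM buying;
    CREATE TABLE vi AS SELECT DISTINCT vendor, item  FROM buying;
""")
# Rejoin the projections on their common attributes.
rejoined = conn.execute("""
    SELECT bv.buyer, bv.vendor, bi.item
    FROM bv
    JOIN bi ON bi.buyer = bv.buyer
    JOIN vi ON vi.vendor = bv.vendor AND vi.item = bi.item
""").fetchall()
original = conn.execute("SELECT buyer, vendor, item FROM buying").fetchall()
```

When Claiborne starts selling Jeans, only one row needs to be added to each of bv/vi rather than one row per buyer in the single wide table.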
(B) Explain the architecture of an IMS System
Ans Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
[Architecture diagram: Application A and Application B, each written in a host language + DL/I, access the IMS control program through their program specification blocks (PSB-A, PSB-B), each consisting of PCBs; the IMS control program in turn uses the DBDs to access the physical databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated
conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD    NAME=EDUCPDBD
2  SEGM   NAME=COURSE,BYTES=256
3  FIELD  NAME=(COURSE,SEQ),BYTES=3,START=1
4  FIELD  NAME=TITLE,BYTES=33,START=4
5  FIELD  NAME=DESCRIPN,BYTES=220,START=37
6  SEGM   NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD  NAME=(COURSE,SEQ),BYTES=3,START=1
8  FIELD  NAME=TITLE,BYTES=33,START=4
9  SEGM   NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD  NAME=(DATE,SEQM),BYTES=6,START=1
11 FIELD  NAME=LOCATION,BYTES=12,START=7
12 FIELD  NAME=FORMAT,BYTES=2,START=19
13 SEGM   NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD  NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD  NAME=NAME,BYTES=18,START=7
16 SEGM   NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD  NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD  NAME=NAME,BYTES=18,START=7
19 FIELD  NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers using a host language from which the IMS data
manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End-users are
supported via user-written on-line application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: A possible key.
Each non-key field is functionally dependent on every candidate key.
No attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of functional dependencies used in normalization:
they have a 1:1 relationship between the attribute(s) on the left and right-hand side of
a dependency, hold for all time, and are nontrivial.
Complete set of functional dependencies for a given relation can be very
large
Important to find an approach that can reduce set to a manageable size
Need to identify set of functional dependencies (X) for a relation that is
smaller than complete set of functional dependencies (Y) for that relation
and has property that every functional dependency in Y is implied by
functional dependencies in X
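To make the definition above concrete, here is a small illustrative helper (not from the source): a functional dependency X → Y holds in a relation if no two tuples agree on X but differ on Y. The staff sample data below is assumed.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in the given list of dict-tuples."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if x in seen and seen[x] != y:
            return False  # two tuples agree on lhs but differ on rhs
        seen[x] = y
    return True

staff = [
    {"staffNo": "S1", "branchNo": "B1", "city": "London"},
    {"staffNo": "S2", "branchNo": "B1", "city": "London"},
    {"staffNo": "S3", "branchNo": "B2", "city": "Paris"},
]
```

For this data, staffNo → branchNo and branchNo → city hold, but city → staffNo does not (two London tuples have different staff numbers).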
(D) Explain 4 NF with examples
Ans Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that a relation meets and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties
Normalization in industry pays particular attention to normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
The database designers need not normalize to the highest possible normal form
Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes
Often executed as a series of steps Each step corresponds to a specific normal form which has
known properties
As normalization proceeds relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies
7 NF2: non-first normal form
8 1NF: R is in 1NF iff all domain values are atomic
9 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
10 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
11 BCNF: R is in BCNF iff every determinant is a candidate key
12 Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and its multi-valued dependencies are functional dependencies. 4NF
removes unwanted data structures: multi-valued dependencies.
Either there is no multivalued dependency in the relation, or
there are multivalued dependencies but the attributes are dependent between themselves.
Either of these conditions must hold true in order to be in fourth normal form.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers
multivalued dependencies.
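A short sketch of the classic 4NF situation (example data assumed, not from the source): an employee's skills and languages are independent multi-valued facts, so EMP ->> SKILL and EMP ->> LANG hold, and the wide relation should be split into two tables. Rejoining the two projections reconstructs the original exactly:

```python
# Wide relation: every (skill, language) combination must be stored,
# because the two facts are independent -- the 4NF redundancy.
emp = {
    ("Jones", "Typing", "English"),
    ("Jones", "Typing", "French"),
    ("Jones", "Filing", "English"),
    ("Jones", "Filing", "French"),
}
# 4NF decomposition: one table per independent multi-valued fact.
emp_skill = {(e, s) for e, s, _ in emp}
emp_lang = {(e, l) for e, _, l in emp}
# Natural join of the two projections on the employee name.
rejoined = {(e, s, l) for e, s in emp_skill for e2, l in emp_lang if e == e2}
```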
Q5
Either
(A) What are object oriented database systems What are its features
Ans Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found by a more declarative programming approach. It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could get a user's account information and
efficiently provide extensive information such as transactions, account information entries, etc.
(C) How is database recovery done? Discuss its different types.
Ans SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery can be more easily achieved if you are running at least SQL Server 2000.
It has a built-in feature known as the database recovery model that controls the following:
Both the speed and size of your transaction log backups
The degree to which you might be at risk of losing committed transactions in the event of
media failure
Models
There are three types of database recovery models available
Full Recovery
Bulk Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery The SQL Server fully logs all operations so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log When data files are lost because of media failure the transaction log can be backed up
Database restoration up to any specified time can be achieved after media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
The Log Marks feature allows you to place reference points in the transaction log that allow you to
recover to a log mark.
It logs CREATE INDEX operations. Recovery from a transaction log backup that includes index
creations is done at a faster pace because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
It allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system
Ans
Example
(b) Given Entity Customer with attributes customer_id(primary key) name(
first_name last_name middle_name) phone_number date_of_birth
address(citystatezip_codestreet)
Street(Street_namestreet_numberapartment_number
An entity relationship diagram displays the relationships of the entity sets stored in a database. In other
words, we can say that ER diagrams help you to explain the logical structure of databases. At
first look an ER diagram looks very similar to a flowchart; however, an ER diagram includes
many specialized symbols, and their meanings make this model unique.
Sample ER Diagram
Facts about ER Diagram Model
o ER model allows you to draw Database Design
o It is an easy to use graphical tool for modeling data
o Widely used in Database Design
o It is a GUI representation of the logical structure of a Database
o It helps you to identify the entities which exist in a system and the relationships
between those entities
(b) Differentiate between Network and Hierarchical data model in DBMS
Ans Hierarchical model
1 One to many or one to one relationships
2 Based on parent child relationship
3 Retrieve algorithms are complex and asymmetric
4 Data Redundancy more
Network model
1 Many to many relationships
2 Many parents as well as many children
3 Retrieve algorithms are complex and symmetric
4 Data Redundancy less than in the hierarchical model
Relational model
1 One to OneOne to many Many to many relationships
2 Based on relational data structures
3 Retrieve algorithms are simple and symmetric
4 Data Redundancy less
OR
(c)Draw E-R diagram on Library Management System
Ans
(d) State advantages and disadvantages of following file organizations
(i) Index-Sequential file
Ans
Sequential File Organization
1 A sequential file is designed for efficient processing of records in sorted order on some
search key
o Records are chained together by pointers to permit fast retrieval in search key
order
o Pointer points to next record in order
o Records are stored physically in search key order (or as close to this as possible)
o This minimizes number of block accesses
o Figure 10.15 shows an example with bname as the search key
2 It is difficult to maintain physical sequential order as records are inserted and deleted
o Deletion can be managed with the pointer chains
o Insertion poses problems if no space where new record should go
o If space use it else put new record in an overflow block
o Adjust pointers accordingly
o Figure 10.16 shows the previous example after an insertion
o Problem we now have some records out of physical sequential order
o If very few records in overflow blocks this will work well
o If order is lost reorganize the file
o Reorganizations are expensive and done when system load is low
3 If insertions rarely occur we could keep the file in physically sorted order and reorganize
when insertion occurs In this case the pointer fields are no longer required
The Sequential File
Fixed format used for records
Records are the same length
All fields the same (order and length)
Field names and lengths are attributes of the file
One field is the key field
Uniquely identifies the record
Records are stored in key sequence
The Sequential File
New records are placed in a log file or transaction file
Batch update is performed to merge the log file with the master file
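The batch update described above can be sketched in a few lines (file contents and record layout are assumed): new records accumulate in a sorted transaction (log) file and are merged with the sorted master file on the key field.

```python
import heapq

# Master and log files, both already sorted on the key field (first element).
master = [(101, "Arun"), (105, "Meena"), (110, "Ravi")]
log = [(103, "Sita"), (108, "Kiran")]

# Batch update: merge the sorted log into the sorted master in one pass.
new_master = list(heapq.merge(master, log, key=lambda rec: rec[0]))
```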
(ii) Direct file
Direct Access File System (DAFS) is a network file system similar to Network File System
(NFS) and Common Internet File System (CIFS) that allows applications to transfer data while
bypassing operating system control buffering and network protocol operations that can
bottleneck throughput DAFS uses the Virtual Interface (VI) architecture as its underlying
transport mechanism Using VI hardware an application transfers data to and from application
buffers without using the operating system which frees up the processor and operating system
for other processes and allows files to be accessed by servers using several different operating
systems DAFS is designed and optimized for clustered shared-file network environments that
are commonly used for Internet e-commerce and database applications DAFS is optimized for
high-bandwidth InfiniBand networks and it works with any interconnection that supports VI
including Fibre Channel and Ethernet
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and
promote DAFS Today more than 85 companies are part of the DAFS
Collaborative
Q3
EITHER
(a) Explain tuple relational calculus
Ans
Relational Calculus
Relational calculus query specifies what is to be retrieved rather than how to retrieve it
No description of how to evaluate a query
In first-order logic (or predicate calculus) predicate is a truth-valued function
with arguments
When we substitute values for the arguments function yields an expression
called a proposition which can be either true or false
Relational Calculus
If a predicate contains a variable (e.g. 'x is a member of staff'), there must be a range for x.
When we substitute some values of this range for x, the proposition may be true; for
other values it may be false.
When applied to databases relational calculus has forms tuple and domain
Tuple Relational Calculus
Interested in finding tuples for which a predicate is true. Based on use of tuple variables.
A tuple variable is a variable that 'ranges over' a named relation, i.e. a variable
whose only permitted values are tuples of the relation.
Specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Tuple Relational Calculus - Example
To find details of all staff earning more than $10,000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Tuple Relational Calculus
Can use two quantifiers to tell how many instances the predicate applies to:
Existential quantifier ∃ ('there exists')
Universal quantifier ∀ ('for all')
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called
free variables
Tuple Relational Calculus
Existential quantifier used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧
(B.branchNo = S.branchNo) ∧ B.city = 'London')
Means 'There exists a Branch tuple with the same branchNo as the branchNo of the current
Staff tuple S, and it is located in London'.
Tuple Relational Calculus
Universal quantifier is used in statements about every instance, such as:
(∀B) (B.city ≠ 'Paris')
Means 'For all Branch tuples, the address is not in Paris'.
Can also use ~(∃B)(B.city = 'Paris'), which means 'There are no branches with an
address in Paris'.
Tuple Relational Calculus
Formulae should be unambiguous and make sense
A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
Can recursively build up formulae from atoms:
An atom is a formula
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction
F1 ∨ F2, and negation ~F1
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also
formulae
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000.
{S.fName, S.lName | Staff(S) ∧
S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow.
{S | Staff(S) ∧ (∃P) (PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Tuple Relational Calculus
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
To avoid this add restriction that all values in result must be values in the domain
of the expression
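Since tuple calculus is declarative ("what", not "how"), a Python comprehension is a close analogue of expressions like {S | Staff(S) ∧ S.salary > 10000}. The sample Staff tuples below are assumed for illustration:

```python
# Sample Staff relation as a list of tuples (dicts), assumed for this sketch.
staff = [
    {"staffNo": "SL21", "position": "Manager", "salary": 30000},
    {"staffNo": "SG37", "position": "Assistant", "salary": 12000},
    {"staffNo": "SA9", "position": "Assistant", "salary": 9000},
]

# {S.staffNo | Staff(S) AND S.salary > 10000}
high_paid = [s["staffNo"] for s in staff if s["salary"] > 10000]

# {S.staffNo | Staff(S) AND S.position = 'Manager' AND S.salary > 25000}
managers = [s["staffNo"] for s in staff
            if s["position"] == "Manager" and s["salary"] > 25000]
```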
Data Manipulations in SQL
Select Update Delete Insert Statement
Basic Data retrieval
Condition Specification
Arithmetic and Aggregate operators
SQL Join Multiple Table Queries
Set Manipulation
Any In Contains All Not In Not Contains Exists Union Minus Intersect
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement
Data must be entered later using INSERT
CREATE TABLE S ( SNO    CHAR(5),
                 SNAME  CHAR(20),
                 STATUS DECIMAL(3),
                 CITY   CHAR(15),
                 PRIMARY KEY (SNO) );
Creating Tables
A table name and unique column names must be specified
Columns which are defined as primary keys will never have two rows with the same key
value
Primary key may consist of more than one column (values unique in combination)
called composite key
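The composite-key point can be demonstrated with a throwaway supplier-part table (the sp table and its columns are assumed for illustration): the same SNO may repeat, but the (SNO, PNO) pair must be unique in combination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sp (
        sno CHAR(5),
        pno CHAR(6),
        qty INTEGER,
        PRIMARY KEY (sno, pno)   -- composite key: unique in combination
    )
""")
conn.execute("INSERT INTO sp VALUES ('S1', 'P1', 300)")
conn.execute("INSERT INTO sp VALUES ('S1', 'P2', 200)")  # same sno is fine
try:
    conn.execute("INSERT INTO sp VALUES ('S1', 'P1', 400)")  # duplicate pair
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```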
(b) Explain Data Manipulation in SQL
Ans
A data manipulation language (DML) is a computer programming language used for adding
(inserting), deleting, and modifying (updating) data in a database. A DML is often
a sublanguage of a broader database language such as SQL, with the DML comprising some of
the operators in the language.[1] Read-only selecting of data is sometimes distinguished as being
part of a separate data query language (DQL), but it is closely related and sometimes also
considered a component of a DML; some operators may perform both selecting (reading) and
writing.
A popular data manipulation language is that of Structured Query Language (SQL), which is
used to retrieve and manipulate data in a relational database.[2] Other forms of DML are those
used by IMS/DL/I, CODASYL databases such as IDMS, and others.
In SQL, the data manipulation language comprises the SQL-data change statements,[3] which
modify stored data but not the schema or database objects. Manipulation of persistent database
objects, e.g., tables or stored procedures, via the SQL schema statements,[3] rather than the data
stored within them, is considered to be part of a separate data definition language (DDL). In SQL
these two categories are similar in their detailed syntax, data types, expressions, etc., but distinct
in their overall function.[3]
The SQL-data change statements are a subset of the SQL-data statements; this also contains
the SELECT query statement,[3] which, strictly speaking, is part of the DQL, not the DML. In
common practice, though, this distinction is not made, and SELECT is widely considered to be
part of the DML,[4] so the DML consists of all SQL-data statements, not only the SQL-data
change statements. The SELECT ... INTO form combines both selection and manipulation
and thus is strictly considered to be DML because it manipulates (i.e., modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a
statement, which is almost always a verb. In the case of SQL these verbs are:
SELECT ... FROM ... WHERE (strictly speaking, DQL)
SELECT ... INTO
INSERT INTO ... VALUES
UPDATE ... SET ... WHERE
DELETE FROM ... WHERE
For example, the command to insert a row into table employees:
INSERT INTO employees (first_name, last_name, fname) VALUES ('John', 'Capita', 'xcapit00');
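A hedged sketch of these DML verbs in action, using Python's sqlite3 module and the employees table from the INSERT example; the table schema and the updated value are assumptions for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT INTO ... VALUES
conn.execute(
    "INSERT INTO employees (first_name, last_name, fname) VALUES (?, ?, ?)",
    ("John", "Capita", "xcapit00"),
)

# UPDATE ... SET ... WHERE
conn.execute("UPDATE employees SET last_name = ? WHERE fname = ?", ("Capital", "xcapit00"))

# SELECT ... FROM ... WHERE (strictly speaking, DQL)
rows = conn.execute(
    "SELECT first_name, last_name FROM employees WHERE fname = 'xcapit00'"
).fetchall()

# DELETE FROM ... WHERE
conn.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
remaining = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]

print(rows, remaining)  # [('John', 'Capital')] 0
```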
OR
(c) Explain following integrity rules
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs have
these rules automatically, but it is safer to just make sure that the rules are
already applied in the design. There are two types of integrity mentioned in
integrity rules: entity and reference. Two additional rules that aren't
necessarily included in integrity rules but are pertinent to database designs
are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that
is unique; this ensures that each row is uniquely identified by the primary
key. One requirement for entity integrity is that a primary key cannot have a
null value. The purpose of this integrity is to have each row carry a unique
identity, so that foreign key values can properly reference primary key values.
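The null-key requirement of entity integrity can be demonstrated with a small sketch; the dept table is hypothetical. Note that NOT NULL is spelled out because SQLite, unlike the SQL standard, otherwise permits NULLs in a non-INTEGER primary key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Entity integrity: the primary key must be unique AND non-null.
conn.execute("CREATE TABLE dept (dept_id TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO dept VALUES ('D1', 'Sales')")

# A row with a NULL primary key has no identity and is rejected.
try:
    conn.execute("INSERT INTO dept VALUES (NULL, 'Marketing')")
    null_key_allowed = True
except sqlite3.IntegrityError:
    null_key_allowed = False

print(null_key_allowed)  # False
```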
Theta Join
In theta join we apply a condition on the input relation(s), and only the selected
rows are used in the cross product to be merged and included in the output. In a
normal cross product, all the rows of one relation are mapped/merged with all
the rows of the second relation, but here only selected rows of a relation are
cross-producted with the second relation. It is denoted as R ⋈θ S.
If R and S are two relations, then θ is the condition which is applied for the select
operation on one relation, and then only the selected rows are cross-producted with all the
rows of the second relation. For example, given the two relations FACULTY and
COURSE, we first apply the select operation on the FACULTY relation to
select certain specific rows; these rows then have a cross product with the
COURSE relation. This is the difference between cross product and theta join.
We will now see both relations, their different attributes, and then finally the
cross product after carrying out the select operation on the relation.
From this example the difference between cross product and theta join becomes clear.
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having
the foreign key (in the associated table) reference a primary key value (in the primary, or
parent, table). Because of this, we need to ensure that data on both sides of the relationship
remain intact.
So, referential integrity requires that whenever a foreign key value is used it must reference a
valid, existing primary key in the parent table.
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no
foreign key in any related table with the value of 15. We should only be able to delete a primary
key if there are no associated records. Otherwise, we would end up with an orphaned record.
Here, the related table contains a foreign key value that doesn't exist in the primary key field of
the primary table (i.e., the "CompanyId" field). This has resulted in an "orphaned record".
So referential integrity will prevent users from
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned usually
with no indication of an error This could result in records being ldquolostrdquo in the database because
theyrsquore never returned in queries or reports
It could also result in strange results appearing in reports (such as products without an associated
company)
Or worse yet it could result in customers not receiving products they paid for
Worse still it could affect life and death situations such as a hospital patient not receiving the
correct treatment or a disaster relief team not receiving the correct supplies or information
Data Integrity
Referential integrity is a subset of data integrity which is concerned with the accuracy and
consistency of all data (relationship or otherwise) Maintaining data integrity is a crucial part of
working with databases
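The orphaned-record scenario above can be sketched with SQLite's foreign key enforcement; the company/product tables and key values, including the missing key 15, are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    company_id INTEGER REFERENCES company(company_id))""")

conn.execute("INSERT INTO company VALUES (1, 'Acme')")
conn.execute("INSERT INTO product VALUES (10, 1)")  # valid parent: accepted

# An orphaned record (company 15 does not exist) is rejected.
try:
    conn.execute("INSERT INTO product VALUES (11, 15)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

# Deleting a parent that still has matching child records is likewise rejected.
try:
    conn.execute("DELETE FROM company WHERE company_id = 1")
    parent_delete_allowed = True
except sqlite3.IntegrityError:
    parent_delete_allowed = False

print(orphan_allowed, parent_delete_allowed)  # False False
```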
(d) Explain the following domain in detail with example.
Ans: Definition: The domain of a database attribute is the set of all allowable values that
attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column.
In data management and database analysis a data domain refers to all the unique values which
a data element may contain The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values[1]
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type and
allowed to have one of two known code values: 'M' for male, 'F' for female, and NULL for
records where gender is unknown or not applicable (or, arguably, 'U' for unknown as a sentinel
value). The data domain for the gender column is {'M', 'F'}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value (excluding NULL). Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules if database-enforced may be implemented through a check
constraint or in more complex cases in a database trigger For example a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
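A minimal sketch of database-enforced domain rules via check constraints, combining the gender code list and the positive-value rule described above; the table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The gender column's domain {M, F} and a positive-salary rule are both
# declared as CHECK constraints, so the DBMS enforces the domain boundary.
conn.execute("""CREATE TABLE person (
    name   TEXT,
    gender TEXT CHECK (gender IN ('M', 'F')),
    salary NUMERIC CHECK (salary > 0))""")

conn.execute("INSERT INTO person VALUES ('Mary', 'F', 30000)")  # in-domain: accepted

# A value outside the enumerated domain is rejected.
try:
    conn.execute("INSERT INTO person VALUES ('Sam', 'X', 30000)")
    out_of_domain_allowed = True
except sqlite3.IntegrityError:
    out_of_domain_allowed = False

print(out_of_domain_allowed)  # False
```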
(ii) Degree and cardinality
The degree of relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) to the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
Note that the last is written M:N and not M:M, since the number of occurrences on each side need not be equal.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-
to-one relationship rarely exists in practice, but it can; in that case you may consider combining
the two entities into one.
For example, an employee is allocated a company car which can only be driven by that
employee.
Therefore there is a one-to-one relationship between employee and company car.
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example,
taking the employee and department entities shown on the previous page, an employee works in
one department but a department has many employees.
Therefore there is a one-to-many relationship between department and employee.
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity
The normalisation process discussed earlier would prevent any such relationships but the
definition is included here for completeness
As with one-to-one relationships, many-to-many relationships rarely exist. Normally they occur
because an entity has been missed.
For example an employee may work on several projects at the same time and a project has a
team of many employees
Therefore there is a many-to-many relationship between employee and project
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL), the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct, and self-contained language. The DBTG is intended to meet the
requirements of many distinct programming languages, not just COBOL; the user in a
DBTG system is considered to be an ordinary application programmer, and the language
therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the
Data Definition Language or DDL), the DBTG has proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of the conceptual scheme that
was itself defined using the Data Definition Language. It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view.
Architecture of DBTG Model
The architecture of a DBTG system is illustrated in Figure.
The architecture of the DBTG model can be divided into three different levels, as with the
architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data-items they
contain, and the sets into which they are grouped. (Here logical record types are referred
to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data-items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data-item, and set are excluded.
In the DBTG model the users are application programmers writing in an ordinary
programming language such as COBOL that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data-item)
defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization: The process of decomposing unsatisfactory "bad" relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties
Normalization in industry pays particular attention to
normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups
To create an unnormalized table
transform data from the information source (e.g., a form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value
If a table of data meets the definition of a relation it is in first normal form
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table
Identify repeating group(s) in unnormalized table which repeats for the key attribute(s)
Remove repeating group by:
entering appropriate data into the empty columns of rows containing repeating
data ('flattening' the table)
Or by
placing repeating data, along with a copy of the original key attribute(s), into a
separate relation
Second Normal Form (2NF)
Based on concept of full functional dependency
A and B are attributes of a relation
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A
2NF - A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on the primary key
Second Normal Form (2NF)
1NF and no partial functional dependencies
Partial functional dependency when one or more non-key attributes are functionally
dependent on part of the primary key
Every non-key attribute must be defined by the entire key not just by part of the key
If a relation has a single attribute as its key then it is automatically in 2NF
1NF to 2NF
Identify primary key for the 1NF relation
Identify functional dependencies in the relation
If partial dependencies exist on the primary key remove them by placing them in a new
relation along with copy of their determinant
Third Normal Form (3NF)
2NF and no transitive dependencies
Transitive dependency a functional dependency between two or more non-key attributes
Based on concept of transitive dependency
A, B and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C)
3NF - A relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key
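The 1NF-to-2NF step described above can be sketched in plain Python on a hypothetical OrderLine relation whose composite key is (order_no, product_no); product_name depends only on product_no, i.e., a partial dependency that must be removed.

```python
# 1NF relation with a partial dependency:
# key = (order_no, product_no), but product_name depends on product_no alone.
order_line_1nf = [
    # (order_no, product_no, product_name, qty)
    (1, "P1", "Bolt", 10),
    (1, "P2", "Nut", 5),
    (2, "P1", "Bolt", 7),
]

# Remove the partial dependency: product_name moves into a new relation
# keyed by product_no, together with a copy of its determinant.
product = {p_no: p_name for (_, p_no, p_name, _) in order_line_1nf}

# What remains is fully dependent on the whole composite key: 2NF.
order_line_2nf = [(o_no, p_no, qty) for (o_no, p_no, _, qty) in order_line_1nf]

print(product)         # {'P1': 'Bolt', 'P2': 'Nut'}
print(order_line_2nf)  # [(1, 'P1', 10), (1, 'P2', 5), (2, 'P1', 7)]
```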
OR
(c)Explain multivalued dependency with suitable example
As normalization proceeds relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies
Ans
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is said to be in
4NF if and only if it is in BCNF and its multi-valued dependencies are functional
dependencies. 4NF removes unwanted data structures: multi-valued dependencies.
For a relation to be in fourth normal form, either of these conditions must hold:
There is no multivalued dependency in the relation, or
there are multivalued dependencies but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
uses multivalued dependencies.
(d) What are inference axioms Explain its significance in Relational
Database Design
Ans Inference Axioms (A-axioms or Armstrongrsquos Axioms)
An inference axiom is a rule that states if a relation satisfies certain FDs then it must satisfy
certain other FDs
F1 (Reflexivity): X → X
F2 (Augmentation): If Z ⊆ W and X → Y, then XW → YZ
F3 (Additivity): If X → Y and X → Z, then X → YZ
F4 (Projectivity): If X → YZ, then X → Y
F5 (Transitivity): If X → Y and Y → Z, then X → Z
F6 (Pseudotransitivity): If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity of (2) and (4)
[From Maier]
1. Let R = (A B C D E G H I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
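Such derivations can also be checked mechanically by computing an attribute closure. The sketch below implements the standard closure algorithm and applies it to the FD set of the second example to confirm that GH is in the closure of AB.

```python
def closure(attrs, fds):
    """Closure of an attribute set under a list of FDs given as (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already in the closure, add the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# F = {AB -> E, AG -> J, BE -> I, E -> G, GI -> H} from the example above.
F = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]

ab_closure = closure("AB", F)
print(sorted(ab_closure))       # ['A', 'B', 'E', 'G', 'H', 'I', 'J']
print(set("GH") <= ab_closure)  # True, so AB -> GH holds
```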
Significance in Relational Database design: A database structure commonly used in GIS in
which data is stored based on two-dimensional tables, where multiple relationships between data
elements can be defined and established in an ad-hoc manner. Relational Database Management
System: a database system made up of files with data elements in a two-dimensional array (rows
and columns). This database management system has the capability to recombine data elements
to form different relations, resulting in great flexibility of data usage.
A database that is perceived by the user as a collection of two-dimensional tables:
• Tables are manipulated a set at a time rather than a record at a time.
• SQL is used to manipulate relational databases. Proposed by Dr. Codd in 1970.
• The basis for the relational database management system (RDBMS).
• The relational model contains the following components:
• Collection of objects or relations
• Set of operations to act on the relations
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be avoided in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens. One way to prevent or to avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order to prevent such instances. Essentially,
once a deadlock does occur the DBMS must have a method for detecting the deadlock,
and then to resolve it the DBMS must select a transaction to cancel and revert the entire
transaction until the resources required become available, allowing one transaction to
complete while the other has to be reprocessed at a later time.
Explain the meaning of the expression ACID transaction.
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens it
should be atomic: it should either be complete or fully incomplete; there should not
be anything like semi-complete. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, then the transactions
should be scheduled in such a fashion that they remain in isolation from one another. Durability
means that once a transaction commits, its effects will persist even if there are system
failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database is to operate while transactions are in the
process of being changed. Their purpose is to ensure consistency throughout the database. For
example, if I am changing a row which affects the calculations or outputs of several other rows,
then all rows that are affected or possibly affected by a change in the row I'm working on will
be locked from changes until I am complete with my change. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let's
say I'm processing a change to the tax rate in my state; my store clerk shouldn't be able
to read the total cost of a blue shirt, because the total cost row is affected by any changes in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but hasn't been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
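The "resource access order" idea mentioned above can be sketched with ordinary threading locks: if every transaction acquires its locks in one agreed global order, a circular wait (and hence a deadlock) cannot form. The transaction bodies here are, of course, stand-ins.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
LOCK_ORDER = [lock_a, lock_b]  # every transaction must acquire in this order

log = []

def transaction(name):
    # Both transactions need both locks, but because they honour the
    # same global order, neither can hold one lock while waiting for
    # a lock the other holds: no circular wait is possible.
    for lock in LOCK_ORDER:
        lock.acquire()
    try:
        log.append(name)  # stand-in for the real work
    finally:
        for lock in reversed(LOCK_ORDER):
            lock.release()

threads = [threading.Thread(target=transaction, args=(n,)) for n in ("T1", "T2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(log))  # ['T1', 'T2']
```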
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation, and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
• Lock-based protocols
• Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
• Binary Locks: A lock on a data item can be in two states; it is either locked or
unlocked.
• Shared/exclusive: This type of locking mechanism differentiates the locks based on
their uses. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock. Allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state. Read locks are shared because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking 2PL
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
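A small sketch of the 2PL rule itself: a checker that scans a transaction's lock/unlock sequence and rejects any lock request made after the first release, i.e., after the shrinking phase has begun. The (op, item) action format is an assumption for illustration.

```python
def is_two_phase(actions):
    """Return True iff the lock/unlock sequence obeys two-phase locking:
    once any lock is released (shrinking phase), no new lock is acquired."""
    shrinking = False
    for op, _item in actions:
        if op == "unlock":
            shrinking = True           # shrinking phase has begun
        elif op == "lock" and shrinking:
            return False               # growing after shrinking: violation
    return True

# Acquires everything before releasing anything: valid 2PL schedule.
ok = [("lock", "x"), ("lock", "y"), ("unlock", "x"), ("unlock", "y")]
# Acquires y after releasing x: violates 2PL.
bad = [("lock", "x"), ("unlock", "x"), ("lock", "y"), ("unlock", "y")]

print(is_two_phase(ok), is_two_phase(bad))  # True False
```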
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as that of 2PL. After acquiring all the locks in the first
phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not
release a lock after using it. Strict-2PL holds all the locks until the commit point and releases
them all at once.
Strict-2PL does not have cascading aborts as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based protocol This protocol
uses either system time or logical counter as a timestamp
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution whereas timestamp-based protocols start working as soon as a transaction is
created
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 0002 would be older than all other
transactions that come after it. For example, any transaction y entering the system at 0004 is
two seconds younger, and the priority would be given to the older one.
In addition, every data item is given the latest read- and write-timestamp. This lets the system
know when the last 'read' and 'write' operations were performed on the data item.
OR
(b) Explain database security mechanisms
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include
• Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
• Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
• Physical security of the database server and backup equipment from theft and natural
disasters
• Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d) Explain knowledge-based database system in detail
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties:
• Flat data: Data was usually represented in a tabular format with strings or numbers in each
field.
• Multiple users: A conventional database needed to support more than one user or system
logged into the same data at the same time.
• Transactions: An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-called ACID
properties: Atomicity, Consistency, Isolation, and Durability.
• Large, long-lived data: A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements An expert system requires structured data Not just tables with numbers and
strings but pointers to other objects that in turn have additional pointers The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes subclasses and instances
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database The knowledge-base needed to know facts about the world For example to represent
the statement that All humans are mortal A database typically could not represent this general
knowledge but instead would need to store information about thousands of tables that
represented information about specific humans Representing that all humans are mortal and
being able to reason about any given human that they are mortal is the work of a knowledge-
base Representing that George Mary Sam Jenna Mike and hundreds of thousands of other
customers are all humans with specific ages sex address etc is the work for a database[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets. From the AI and Object-Oriented
communities, object-oriented databases such as Versant emerged. These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge base was the Internet. With the rise of the Internet, documents, hypertext, and multimedia support were now critical for any corporate database. It was no longer enough to support large tables of data or relatively small objects that lived primarily in computer memory. Support for corporate web sites required persistence and transactions for documents. This created a whole new discipline known as Web Content Management. The other driver for document support was the rise of knowledge management vendors such as Lotus Notes. Knowledge management actually predated the Internet, but with the Internet there was great synergy between the two areas. Knowledge management products adopted the term "knowledge base" to describe their repositories, but the meaning had a subtle difference. In the case of previous knowledge-based systems, the knowledge was primarily for the use of an automated system, to reason about and draw conclusions about the world. With knowledge management products, the knowledge was primarily meant for humans, for example to serve as a repository of manuals, procedures, policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a system that could really be cleanly classified as knowledge-based in the sense of an expert system that performed automated reasoning, or knowledge-based in the sense of knowledge management that provided knowledge in the form of documents and media that could be leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018-2019
Subject: DBMS
MCA 1st year (Sem. II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another. Concurrent access is quite easy if all users are just reading data: there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user system. It helps you make sure that database transactions are performed concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data are executed simultaneously.
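The core problem can be sketched in a few lines of Python (a minimal illustration, not a DBMS implementation): several concurrent "transactions" each perform a read-modify-write cycle on shared data, and a lock serializes the critical section so that no update is lost.

```python
import threading

# Shared "account balance" updated by several concurrent transactions.
# Without coordination, interleaved read-modify-write cycles can lose
# updates; the lock enforces mutual exclusion over the critical section.
balance = 0
lock = threading.Lock()

def deposit(amount, times):
    global balance
    for _ in range(times):
        with lock:                      # concurrency control
            current = balance           # READ
            balance = current + amount  # WRITE

threads = [threading.Thread(target=deposit, args=(1, 10000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(balance)  # 40000: no updates were lost
```

Real DBMSs achieve the same effect with finer-grained mechanisms such as two-phase locking or timestamp ordering rather than a single global lock.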
(ii) Atomicity property
Ans: In database systems, atomicity (from Ancient Greek ἄτομος, átomos, "undividable") is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client: at one moment in time it has not yet happened, and at the next it has already occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B. Performing these operations in an atomic transaction ensures that the database remains in a consistent state, that is, money is neither lost nor created if either of those two operations fails.
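The bank transfer example can be demonstrated with Python's built-in sqlite3 module, where the connection's context manager commits on success and rolls back on error (table and account names here are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
con.commit()

def transfer(con, src, dst, amount):
    """Move `amount` from src to dst atomically: both updates or neither."""
    try:
        with con:  # commits on normal exit, rolls back on exception
            con.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                        (amount, src))
            con.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                        (amount, dst))
            cur = con.execute("SELECT balance FROM account WHERE name = ?", (src,))
            if cur.fetchone()[0] < 0:
                raise ValueError("insufficient funds")  # abort mid-transaction
    except ValueError:
        pass  # transaction rolled back; database unchanged

transfer(con, "A", "B", 30)    # succeeds: both updates applied
transfer(con, "A", "B", 1000)  # fails after the withdrawal, rolled back
balances = dict(con.execute("SELECT name, balance FROM account"))
print(balances)  # {'A': 70, 'B': 80} -- money neither lost nor created
```

The second transfer fails after the withdrawal has already executed, yet neither account changes: the whole series of operations is undone, which is exactly the atomicity guarantee.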
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one that is closest to the user. It is also called the user view. The user view is different from the way data is stored in the database; this view describes only a part of the actual database. Because each user is not concerned with the entire database, only the part that is relevant to the user is visible. For example, end users and application programmers get different external views.
Each user uses a language to carry out database operations. The application programmer uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data manipulation language performs operations on these objects. The data control language is used to control the user's access to database objects.
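The division of labour between the three sublanguages can be seen with sqlite3 (SQLite has no user accounts, so the DCL statement is shown only as a comment of the form it takes in a server DBMS):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# DDL: define and declare a database object
con.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object
con.execute("INSERT INTO student (id, name) VALUES (1, 'Asha')")
con.execute("UPDATE student SET name = 'Asha K' WHERE id = 1")
rows = con.execute("SELECT name FROM student").fetchall()
print(rows)  # [('Asha K',)]

# DCL controls user access; in a server DBMS it would look like:
#   GRANT SELECT ON student TO some_user;
#   REVOKE SELECT ON student FROM some_user;
```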
Conceptual Level - This level comes between the external and the internal levels. The conceptual level represents the entire database as a whole and is used by the DBA. This level is the view of the data "as it really is". The user's view of the data is constrained by the language that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for DBMS are suitably explained above.
(C) Describe the structure of DBMS.
Ans: A DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations (insert, delete, update and retrieval) on the database. The components of the DBMS perform these requested operations on the database and provide the necessary data to the users.
Fig.: Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified in the DDL. It stores metadata information such as the names of the files, data items, storage details of each file, mapping information, constraints, etc.
2. DML Compiler and Query Optimizer - DML commands such as insert, update, delete and retrieve from the application program are sent to the DML compiler for compilation into object code for database access. The object code is then optimized by the query optimizer into the best way to execute the query, and then sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also known as the Database Control System.
The main functions of the Data Manager are:
It converts operations in users' queries, coming from the application programs or from the combination of DML compiler and query optimizer (known as the Query Processor), from the user's logical view to the physical file system.
It controls access to the DBMS information that is stored on disk.
It also controls the handling of buffers in main memory.
It also enforces constraints to maintain the consistency and integrity of the data.
It also synchronizes the simultaneous operations performed by concurrent users.
It also controls the backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures, access paths, files and record sizes.
5. Access authorization - the description of database users, their responsibilities and their access rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to actually control data integrity, database operation and accuracy, and may be used as an important part of the DBMS.
Importance of the Data Dictionary - A data dictionary is necessary in databases for the following reasons:
It improves the control of the DBA over the information system and the users' understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts high-level queries into low-level file access commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups, depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve users: Naïve users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database - in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naïve users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naïve users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing the application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and the access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It implies separation of physical storage from the use of the data by an application program, i.e. program/data independence: the user, programmer or application specialist need not know the details of how the data are stored, as such details are transparent to the user. Changes can be made to the data without affecting other components of the system, e.g. changing the format of data items (real to integer arithmetic), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files.
• Wastage of storage space, since duplicated data is stored.
• Errors generated due to updating of the same data in different files.
• Time wasted in entering the same data again and again.
• Computer resources being needlessly used.
• Difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data. We therefore need to remove this duplication of data across multiple files to eliminate inconsistency.
3. Better Service to the Users - A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness are likely to improve, since the data can now be shared, and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, the use of a DBMS should allow users who don't know programming to interact with the data more easily, unlike a file processing system where the programmer may need to write new programs to meet every new demand.
4. Flexibility of the System is Improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5. Integrity can be Improved - Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be Enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security can be Improved - In conventional systems, applications are developed in an ad hoc, temporary manner. Often different systems of an organization access different components of the operational data, and in such an environment enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8. The Organization's Requirements can be Identified - All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. It may thus become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9. Overall Cost of Developing and Maintaining Systems is Lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for similar services using conventional systems, since the productivity of programmers can be higher using the non-procedural languages that have been developed with DBMSs than using procedural languages.
10. A Data Model must be Developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed as the needs of particular applications demand; the overall view is often not considered. Building an overall view of the organization's data is usually cost-effective in the long term.
11. Provides Backup and Recovery - Centralizing a database provides schemes such as recovery and backup from failures, including disk crashes, power failures and software errors, which may help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods are very complex.
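The backup-and-restore idea in point 11 can be sketched with sqlite3's backup API (a minimal illustration; the table and values are made up for the example):

```python
import sqlite3

# Copy a live database to a backup, "lose" the original to a simulated
# failure, then restore the pre-failure state from the backup copy.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
live.execute("INSERT INTO orders VALUES (1, 99.5)")
live.commit()

backup = sqlite3.connect(":memory:")
live.backup(backup)               # full backup of the consistent state

live.execute("DROP TABLE orders")  # simulated failure/corruption

restored = sqlite3.connect(":memory:")
backup.backup(restored)           # recovery from the backup copy
row = restored.execute("SELECT total FROM orders WHERE id = 1").fetchone()
print(row)  # (99.5,)
```

Production DBMSs combine such full backups with transaction logs so that committed work after the last backup can also be replayed.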
QUE 2- EITHER
(A) Explain the ER model with a suitable example.
Ans: The ER model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Modelling is an iterative, team-oriented process with all business managers (or their designates) involved, and the model should be validated with a "bottom-up" approach. It has three primary components: entity, relationship and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type, and there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name, address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many: 1 ------ M
Many to one: M ------ 1
Many to many: M ------ M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example:
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), where street is itself composite (street_name, street_number, apartment_number).
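One common way to realize this entity relationally (a sketch, assuming phone_number is multivalued) is to flatten the composite attributes name and address into simple columns, and to move the multivalued phone_number into its own table keyed by the entity's primary key:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    first_name    TEXT, middle_name TEXT, last_name TEXT,  -- composite: name
    date_of_birth TEXT,
    city TEXT, state TEXT, zip_code TEXT,                  -- composite: address
    street_name TEXT, street_number TEXT, apartment_number TEXT
);
CREATE TABLE customer_phone (                              -- multivalued attribute
    customer_id  INTEGER REFERENCES customer(customer_id),
    phone_number TEXT,
    PRIMARY KEY (customer_id, phone_number)
);
""")
con.execute("INSERT INTO customer (customer_id, first_name, last_name) "
            "VALUES (1, 'Ravi', 'Mehta')")
con.executemany("INSERT INTO customer_phone VALUES (?, ?)",
                [(1, '555-0100'), (1, '555-0101')])
n = con.execute("SELECT COUNT(*) FROM customer_phone "
                "WHERE customer_id = 1").fetchone()[0]
print(n)  # 2 -- one customer, two phone numbers
```

A derived attribute such as age would not be stored at all; it would be computed from date_of_birth at query time.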
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index-sequential files and direct files, we have considered the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
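The "stud_name" example can be reproduced with sqlite3 (illustrative table and names): a secondary index is built on a non-primary-key attribute, and a lookup on it returns a set of matching records rather than at most one.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, stud_name TEXT)")
con.executemany("INSERT INTO student VALUES (?, ?)",
                [(1, 'Amit'), (2, 'Priya'), (3, 'Amit'), (4, 'Rahul')])

# Secondary index on a non-primary-key attribute
con.execute("CREATE INDEX idx_stud_name ON student (stud_name)")

# Unlike a primary-key lookup, a secondary-key value can match many records
matches = con.execute(
    "SELECT roll_no FROM student WHERE stud_name = ? ORDER BY roll_no",
    ('Amit',)).fetchall()
print(matches)  # [(1,), (3,)] -- the set of records satisfying the value
```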
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3- EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE 4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and every join dependency in it is a consequence of the candidate keys.
Another way of expressing this is that a table in 5NF cannot be non-loss decomposed any further.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise);
For any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:

buyer | vendor        | item
------+---------------+---------
Sally | Liz Claiborne | Blouses
Mary  | Liz Claiborne | Blouses
Sally | Jordach       | Jeans
Mary  | Jordach       | Jeans
Sally | Jordach       | Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
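The decomposition can be checked mechanically with sqlite3: project Buying onto the three pairwise tables, then confirm that the three-way join on the common keys reconstructs the original rows (the lossless-join property a join dependency guarantees).

```python
import sqlite3

con = sqlite3.connect(":memory:")
buying = [('Sally', 'Liz Claiborne', 'Blouses'),
          ('Mary',  'Liz Claiborne', 'Blouses'),
          ('Sally', 'Jordach', 'Jeans'),
          ('Mary',  'Jordach', 'Jeans'),
          ('Sally', 'Jordach', 'Sneakers')]
con.execute("CREATE TABLE buying (buyer TEXT, vendor TEXT, item TEXT)")
con.executemany("INSERT INTO buying VALUES (?, ?, ?)", buying)

# Decompose into the three pairwise projections
con.executescript("""
CREATE TABLE buyer_vendor AS SELECT DISTINCT buyer, vendor FROM buying;
CREATE TABLE vendor_item  AS SELECT DISTINCT vendor, item  FROM buying;
CREATE TABLE buyer_item   AS SELECT DISTINCT buyer, item   FROM buying;
""")

# The three-way join on common keys reconstructs the original table
rejoined = con.execute("""
    SELECT bv.buyer, bv.vendor, vi.item
    FROM buyer_vendor bv
    JOIN vendor_item vi ON bv.vendor = vi.vendor
    JOIN buyer_item  bi ON bi.buyer = bv.buyer AND bi.item = vi.item
""").fetchall()
print(sorted(rejoined) == sorted(buying))  # True: the decomposition is lossless
```

In the 5NF design, recording that Claiborne starts to sell jeans takes a single new row in vendor_item, instead of one row per buyer in the original table.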
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support both batch and online application programs.

Fig.: IMS system structure - application programs A and B, each written in a host language plus DL/I, are linked through their PSBs (PSB-A, PSB-B), each containing PCBs, to the IMS control program, which maps requests onto the physical databases defined by the DBDs.
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD), which also gives the mapping of the physical database to storage. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1 DBD NAMEEDUCPDBD
2 SEGM NAME=COURSEBYTES=256
3 FILED NAME=(COURSESEQ)BYTES=3START=1 4 FIELD NAME=TITLE BYTES=33START=4
5 FIELD NAME=DESCRIPNBYETS=220START=37
6 SEGM NAME=PREREQPARENT=COURSEBYTES=36 7 FILED NAME=(COURSESEQ)BYTES=3START=1
8 FIELD NAME=TITLE BYTES=33START=4
9 SEGM NAME=OFFERINGPARENT=COURSEBYTES=20 10 FILED NAME=(DATESEQM)BYTES=6START=1
11 FIELD NAME=LOCATION BYTES=12START=7
12 FIELD NAME=FORMATBYETS=2START=19 13 SEGM NAME=TEACHERPARENT=OFFERINGBYTES=24
14 FIELD NAME=(EMPSEQ) BYTES=6START=1
15 FIELD NAME=NAMEBYETS=18START=7
16 SEGM NAME=STUDENTPARENT=OFFERINGBYTES=25
17 FILED NAME=(EMPSEQ)BYTES=6START=1
18 FIELD NAME=NAME BYTES=18START=7
19 FIELD NAME=GRADEBYTES=1START=25
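The hierarchy this DBD defines (COURSE at the root, with PREREQ and OFFERING children, and TEACHER and STUDENT beneath OFFERING) can be sketched in Python as a tree of segment occurrences; the traversal below loosely mimics DL/I retrieving every occurrence of a segment type beneath a given parent. Segment values are invented for illustration.

```python
# A hypothetical in-memory occurrence of part of the EDUCPDBD hierarchy:
# each physical database is a tree of segments, not a flat table.
course = {
    "segment": "COURSE", "COURSE#": "M23", "TITLE": "Database Systems",
    "children": [
        {"segment": "OFFERING", "DATE#": "730813", "LOCATION": "Nagpur",
         "children": [
             {"segment": "STUDENT", "EMP#": "861001", "NAME": "Sarah",
              "children": []},
             {"segment": "STUDENT", "EMP#": "861002", "NAME": "Vikram",
              "children": []},
         ]},
    ],
}

def get_under_parent(root, segment_name):
    """Depth-first walk collecting all occurrences of a segment type
    beneath the given parent segment."""
    found = []
    for child in root["children"]:
        if child["segment"] == segment_name:
            found.append(child)
        found.extend(get_under_parent(child, segment_name))
    return found

students = get_under_parent(course, "STUDENT")
print([s["NAME"] for s in students])  # ['Sarah', 'Vikram']
```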
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called a program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example:
PCB    TYPE=DB,DBDNAME=EDUCPDBD,KEYLEN=15
SENSEG NAME=COURSE,PROCOPT=G
SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on each segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are supported via user-written online application programs. IMS does not provide an integrated query language.
OR
(C) Explain the following:
(i) Functional dependency
Ans: Functional Dependency: the value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: a possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
The main characteristics of the functional dependencies used in normalization are that they have a 1:1 relationship between the attribute(s) on the left-hand and right-hand sides of the dependency, they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
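Whether a given dependency holds in a particular relation instance can be checked directly from the definition: equal determinant values must imply equal dependent values. A small sketch (the sample rows and attribute names are invented):

```python
def holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in `rows`
    (each row a dict): rows agreeing on lhs must agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)  # determinant values
        val = tuple(row[a] for a in rhs)  # dependent values
        if key in seen and seen[key] != val:
            return False                  # same determinant, different dependent
        seen[key] = val
    return True

rows = [
    {"stud_id": 1, "stud_name": "Asha", "dept": "MCA"},
    {"stud_id": 2, "stud_name": "Ravi", "dept": "MCA"},
    {"stud_id": 1, "stud_name": "Asha", "dept": "MCA"},
]
print(holds(rows, ["stud_id"], ["stud_name"]))  # True: stud_id determines stud_name
print(holds(rows, ["dept"], ["stud_name"]))     # False: same dept, different names
```

Note that such a check can only refute a dependency from data; whether it truly "holds for all time" is a semantic decision about the enterprise, not a property of one instance.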
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal form condition that it meets, and indicates the degree to which it has been normalized. Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; here we will pay particular attention up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
(Determinant: an attribute on which some other attribute is fully functionally dependent.)
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and all its multivalued dependencies are in fact functional dependencies. 4NF thus removes an unwanted data structure: multivalued dependencies.
One of these conditions must hold for a relation to be in fourth normal form:
There is no multivalued dependency in the relation; or
There are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers multivalued dependencies.
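A small worked example (invented data): if an employee's skills are independent of the languages they speak, the single table has the multivalued dependencies emp ->-> skill and emp ->-> language and must store the full cross product per employee. The 4NF decomposition into two tables removes that redundancy and remains lossless.

```python
from itertools import product

skills    = {"Rita": {"SQL", "COBOL", "C"}}
languages = {"Rita": {"English", "Marathi"}}

# Single table: full cross product of skills x languages per employee
flat = {(e, s, l) for e in skills for s, l in product(skills[e], languages[e])}
print(len(flat))  # 6 rows for 3 skills x 2 languages

# 4NF decomposition: two tables, no cross-product redundancy
emp_skill = {(e, s) for (e, s, l) in flat}
emp_lang  = {(e, l) for (e, s, l) in flat}
print(len(emp_skill) + len(emp_lang))  # 3 + 2 = 5 rows

# The decomposition is lossless: joining on emp restores the original
rejoined = {(e, s, l) for (e, s) in emp_skill
                      for (e2, l) in emp_lang if e == e2}
print(rejoined == flat)  # True
```

The saving grows with the data: adding one new skill for Rita costs one row in emp_skill instead of one row per language in the flat table.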
Q5
Either
(A) What are object oriented database systems What are its features
Ans Object databases are a niche field within the broader DBMS market dominated by relational
database management systems (RDBMS) Object databases have been considered since the early 1980s
and 1990s but they have made little impact on mainstream commercial data proc
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could get the user's account information and
efficiently provide extensive information such as transactions, account information entries, etc.
(c) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to
explore the options available in order to prepare for the worst.
SQL Server database recovery can be more easily achieved if you are running at least SQL Server 2000.
It has a built-in feature known as the database recovery model that controls the following:
- Both the speed and size of your transaction log backups
- The degree to which you might be at risk of losing committed transactions in the event of
media failure
Models
There are three types of database recovery models available
- Full Recovery
- Bulk-Logged Recovery
- Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. The SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
- Database restoration up to any specified time can be achieved after media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
- The Log Marks feature allows you to place reference points in the transaction log that allow you to
recover to a log mark.
- CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index
creations is done at a faster pace because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a Distributed System.
Ans
Sample ER Diagram
Facts about the ER Diagram Model:
o The ER model allows you to draw a database design.
o It is an easy-to-use graphical tool for modeling data.
o It is widely used in database design.
o It is a GUI representation of the logical structure of a database.
o It helps you to identify the entities which exist in a system and the relationships
between those entities.
(b) Differentiate between the Network and Hierarchical data models in DBMS.
Ans: Hierarchical model
1. One-to-many or one-to-one relationships.
2. Based on a parent-child relationship.
3. Retrieval algorithms are complex and asymmetric.
4. More data redundancy.
Network model
1. Many-to-many relationships.
2. Many parents as well as many children.
3. Retrieval algorithms are complex and symmetric.
4. Less data redundancy than the hierarchical model.
Relational model
1. One-to-one, one-to-many, and many-to-many relationships.
2. Based on relational data structures.
3. Retrieval algorithms are simple and symmetric.
4. Less data redundancy.
OR
(c) Draw an E-R diagram for a Library Management System.
Ans
(d) State the advantages and disadvantages of the following file organizations:
(i) Index-sequential file
Ans
Sequential File Organization
1. A sequential file is designed for efficient processing of records in sorted order on some
search key.
o Records are chained together by pointers to permit fast retrieval in search-key
order.
o A pointer points to the next record in order.
o Records are stored physically in search-key order (or as close to this as possible).
o This minimizes the number of block accesses.
o Figure 10.15 shows an example with bname as the search key.
2. It is difficult to maintain physical sequential order as records are inserted and deleted.
o Deletion can be managed with the pointer chains.
o Insertion poses problems if there is no space where the new record should go.
o If there is space, use it; else put the new record in an overflow block.
o Adjust pointers accordingly.
o Figure 10.16 shows the previous example after an insertion.
o Problem: we now have some records out of physical sequential order.
o If there are very few records in overflow blocks, this will work well.
o If order is lost, reorganize the file.
o Reorganizations are expensive and done when system load is low.
3. If insertions rarely occur, we could keep the file in physically sorted order and reorganize
when an insertion occurs. In this case the pointer fields are no longer required.
The Sequential File
A fixed format is used for records:
Records are the same length.
All fields are the same (order and length).
Field names and lengths are attributes of the file.
One field is the key field:
It uniquely identifies the record.
Records are stored in key sequence.
New records are placed in a log file or transaction file.
A batch update is performed to merge the log file with the master file.
(ii) Direct file
Direct Access File System (DAFS) is a network file system, similar to Network File System
(NFS) and Common Internet File System (CIFS), that allows applications to transfer data while
bypassing operating system control, buffering, and network protocol operations that can
bottleneck throughput. DAFS uses the Virtual Interface (VI) architecture as its underlying
transport mechanism. Using VI hardware, an application transfers data to and from application
buffers without using the operating system, which frees up the processor and operating system
for other processes and allows files to be accessed by servers using several different operating
systems. DAFS is designed and optimized for clustered, shared-file network environments that
are commonly used for Internet, e-commerce, and database applications. DAFS is optimized for
high-bandwidth InfiniBand networks, and it works with any interconnection that supports VI,
including Fibre Channel and Ethernet.
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and
promote DAFS. Today, more than 85 companies are part of the DAFS Collaborative.
Q3
EITHER
(a) Explain tuple relational calculus
Ans
Relational Calculus
A relational calculus query specifies what is to be retrieved rather than how to retrieve it:
there is no description of how to evaluate the query.
In first-order logic (or predicate calculus), a predicate is a truth-valued function
with arguments.
When we substitute values for the arguments, the function yields an expression,
called a proposition, which can be either true or false.
If a predicate contains a variable (e.g. 'x is a member of staff'), there must be a range for x.
When we substitute some values of this range for x, the proposition may be true; for
other values, it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
Tuple Relational Calculus
We are interested in finding tuples for which a predicate is true. It is based on the use of tuple variables.
A tuple variable is a variable that 'ranges over' a named relation, i.e. a variable
whose only permitted values are tuples of the relation.
Specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Tuple Relational Calculus - Example
To find details of all staff earning more than $10000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
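These calculus expressions correspond directly to SQL queries. A minimal sketch using Python's sqlite3, with a hypothetical Staff table and invented sample rows (the column set is reduced for illustration):

```python
import sqlite3

# In-memory database with a hypothetical, simplified Staff relation
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT, fName TEXT, salary INTEGER)")
conn.executemany("INSERT INTO Staff VALUES (?, ?, ?)",
                 [("S1", "Ann", 12000), ("S2", "Bob", 9000)])

# {S | Staff(S) AND S.salary > 10000}  -- whole tuples
rows = conn.execute("SELECT * FROM Staff WHERE salary > 10000").fetchall()
print(rows)  # [('S1', 'Ann', 12000)]

# {S.salary | Staff(S) AND S.salary > 10000}  -- a single attribute
salaries = conn.execute("SELECT salary FROM Staff WHERE salary > 10000").fetchall()
print(salaries)  # [(12000,)]
```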
Tuple Relational Calculus
We can use two quantifiers to tell how many instances the predicate applies to:
Existential quantifier ∃ ('there exists')
Universal quantifier ∀ ('for all')
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called
free variables.
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧
(B.branchNo = S.branchNo) ∧ B.city = 'London')
This means 'There exists a Branch tuple with the same branchNo as the branchNo of the current
Staff tuple S, and it is located in London'.
The universal quantifier is used in statements about every instance, such as:
(∀B) (B.city ≠ 'Paris')
This means 'For all Branch tuples, the address is not in Paris'.
We can also use ~(∃B) (B.city = 'Paris'), which means 'There are no branches with an
address in Paris'.
Formulae should be unambiguous and make sense.
A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
We can recursively build up formulae from atoms:
An atom is a formula.
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction
F1 ∨ F2, and negation ~F1.
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also
formulae.
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25000:
{S.fName, S.lName | Staff(S) ∧
S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow:
{S | Staff(S) ∧ (∃P) (PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
To avoid this, we add the restriction that all values in the result must be values in the domain
of the expression.
Data Manipulations in SQL
Select Update Delete Insert Statement
Basic Data retrieval
Condition Specification
Arithmetic and Aggregate operators
SQL Join Multiple Table Queries
Set Manipulation
Any, In, Contains, All, Not In, Not Contains, Exists, Union, Minus, Intersect
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement.
Data must be entered later using INSERT.
CREATE TABLE S ( SNO CHAR(5),
SNAME CHAR(20),
STATUS DECIMAL(3),
CITY CHAR(15),
PRIMARY KEY (SNO) )
Creating Tables
A table name and unique column names must be specified.
Columns which are defined as primary keys will never have two rows with the same key
value.
A primary key may consist of more than one column (values unique in combination);
this is called a composite key.
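The composite-key behaviour described above can be demonstrated with Python's sqlite3 module. The SP table and its rows are hypothetical, chosen only to show that key values must be unique in combination:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite (two-column) primary key: values must be unique in combination
conn.execute("""CREATE TABLE SP (
    SNO CHAR(5),
    PNO CHAR(5),
    QTY DECIMAL(5),
    PRIMARY KEY (SNO, PNO))""")

conn.execute("INSERT INTO SP VALUES ('S1', 'P1', 300)")
conn.execute("INSERT INTO SP VALUES ('S1', 'P2', 200)")  # same SNO, new PNO: allowed

rejected = False
try:
    conn.execute("INSERT INTO SP VALUES ('S1', 'P1', 400)")  # duplicate combination
except sqlite3.IntegrityError:
    rejected = True  # the DBMS refuses the second ('S1', 'P1') row
print(rejected)  # True
```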
(b) Explain Data Manipulation in SQL
Ans
A data manipulation language (DML) is a computer programming language used for adding
(inserting), deleting, and modifying (updating) data in a database. A DML is often
a sublanguage of a broader database language such as SQL, with the DML comprising some of
the operators in the language.[1] Read-only selecting of data is sometimes distinguished as being
part of a separate data query language (DQL), but it is closely related and sometimes also
considered a component of a DML; some operators may perform both selecting (reading) and
writing.
A popular data manipulation language is that of Structured Query Language (SQL), which is
used to retrieve and manipulate data in a relational database.[2] Other forms of DML are those
used by IMS/DLI, CODASYL databases such as IDMS, and others.
In SQL, the data manipulation language comprises the SQL-data change statements,[3] which
modify stored data but not the schema or database objects. Manipulation of persistent database
objects, e.g. tables or stored procedures, via the SQL schema statements,[3] rather than the data
stored within them, is considered to be part of a separate data definition language (DDL). In SQL
these two categories are similar in their detailed syntax, data types, expressions, etc., but distinct
in their overall function.[3]
The SQL-data change statements are a subset of the SQL-data statements; this also contains
the SELECT query statement,[3] which, strictly speaking, is part of the DQL, not the DML. In
common practice, though, this distinction is not made, and SELECT is widely considered to be
part of DML,[4] so the DML consists of all SQL-data statements, not only the SQL-data
change statements. The SELECT ... INTO form combines both selection and manipulation
and thus is strictly considered to be DML because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a
statement which is almost always a verb In the case of SQL these verbs are
SELECT FROM WHERE (strictly speaking DQL)
SELECT INTO
INSERT INTO VALUES
UPDATE SET WHERE
DELETE FROM WHERE
For example, the command to insert a row into table employees:
INSERT INTO employees (first_name, last_name, fname)
VALUES ('John', 'Capita', 'xcapit00');
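The DML verbs listed above can be exercised end-to-end with Python's sqlite3 module. The table layout follows the employees example; the updated values are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT INTO ... VALUES
conn.execute("INSERT INTO employees (first_name, last_name, fname) "
             "VALUES ('John', 'Capita', 'xcapit00')")

# UPDATE ... SET ... WHERE
conn.execute("UPDATE employees SET last_name = 'Capital' WHERE fname = 'xcapit00'")

# SELECT ... FROM ... WHERE (strictly speaking, DQL)
row = conn.execute("SELECT first_name, last_name FROM employees "
                   "WHERE fname = 'xcapit00'").fetchone()
print(row)  # ('John', 'Capital')

# DELETE FROM ... WHERE
conn.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
remaining = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(remaining)  # 0
```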
OR
(c) Explain the following integrity rules:
(i) Entity integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce
these rules automatically, but it is safer to make sure that the rules are
already applied in the design. There are two types of integrity mentioned in
integrity rules: entity and referential. Two additional rules that aren't
necessarily included in integrity rules but are pertinent to database designs
are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that
is unique; this ensures that each row is uniquely identified by the primary
key. One requirement for entity integrity is that a primary key cannot have a
null value. The purpose of this integrity is for each row to have a unique
identity, so that foreign key values can properly reference primary key values.
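The two entity-integrity requirements (key values unique and never null) can be seen in action with Python's sqlite3; the company table and its rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# PRIMARY KEY enforces uniqueness; NOT NULL forbids null key values
conn.execute("CREATE TABLE company (CompanyId INT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO company VALUES (15, 'Acme')")

violations = []
for stmt in ("INSERT INTO company VALUES (15, 'Duplicate')",   # duplicate key
             "INSERT INTO company VALUES (NULL, 'No key')"):   # null key
    try:
        conn.execute(stmt)
    except sqlite3.IntegrityError:
        violations.append(stmt)  # both rows are rejected by the DBMS

print(len(violations))  # 2
```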
Theta Join
In a theta join we apply a condition on the input relation(s), and only the selected
rows are used in the cross product to be merged and included in the output. In a
normal cross product, all the rows of one relation are mapped/merged with all
the rows of the second relation, but here only selected rows of one relation are
cross-producted with the second relation. It is denoted as R ⋈θ S.
If R and S are two relations, then θ is the condition which is applied for the select
operation on one relation, and then only the selected rows are cross-producted with all the
rows of the second relation. For example, given two relations FACULTY and
COURSE, we first apply the select operation on the FACULTY relation to
select certain specific rows; then these rows have a cross product with the
COURSE relation.
From this example the difference between a cross product and a theta join becomes clear.
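The idea that a theta join is a selection applied over the cross product can be sketched in a few lines of Python; the FACULTY and COURSE tuples below are invented for illustration:

```python
# Hypothetical relations: FACULTY(fid, name, course_id) and COURSE(course_id, title)
FACULTY = [("F1", "Khan", 10), ("F2", "Riaz", 20)]
COURSE = [(10, "DBMS"), (20, "OS"), (30, "AI")]

def theta_join(r, s, theta):
    """Cross product of r and s, keeping only pairs satisfying predicate theta."""
    return [a + b for a in r for b in s if theta(a, b)]

# theta: FACULTY.course_id = COURSE.course_id (an equijoin, a special case)
result = theta_join(FACULTY, COURSE, lambda f, c: f[2] == c[0])
print(result)
# [('F1', 'Khan', 10, 10, 'DBMS'), ('F2', 'Riaz', 20, 20, 'OS')]
```

With a predicate that is always true, the same function degenerates into the plain cross product, which is exactly the relationship the text describes.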
(ii) Referential integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having
the foreign key (in the associated table) reference a primary key value (in the primary, or
parent, table). Because of this, we need to ensure that data on both sides of the relationship
remain intact.
So referential integrity requires that, whenever a foreign key value is used, it must reference a
valid, existing primary key in the parent table.
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no
foreign key in any related table with the value of 15. We should only be able to delete a primary
key if there are no associated records. Otherwise, we would end up with an orphaned record.
If the related table contains a foreign key value that doesn't exist in the primary key field of
the primary table (i.e. the "CompanyId" field), the result is an "orphaned record".
So referential integrity will prevent users from:
Adding records to a related table if there is no associated record in the primary table.
Changing values in a primary table that result in orphaned records in a related table.
Deleting records from a primary table if there are matching related records.
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually
with no indication of an error. This could result in records being "lost" in the database, because
they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated
company).
Worse yet, it could result in customers not receiving products they paid for.
Worse still, it could affect life-and-death situations, such as a hospital patient not receiving the
correct treatment, or a disaster relief team not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity, which is concerned with the accuracy and
consistency of all data (relationship or otherwise). Maintaining data integrity is a crucial part of
working with databases.
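A minimal demonstration of referential-integrity enforcement, using Python's sqlite3 and the CompanyId example above. The table layouts are assumed for illustration; note that SQLite only enforces foreign keys when the pragma is switched on:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE company (CompanyId INT PRIMARY KEY)")
conn.execute("CREATE TABLE product (ProductId INT PRIMARY KEY, "
             "CompanyId INT REFERENCES company(CompanyId))")
conn.execute("INSERT INTO company VALUES (15)")
conn.execute("INSERT INTO product VALUES (1, 15)")  # parent exists: accepted

blocked = []
try:
    conn.execute("INSERT INTO product VALUES (2, 99)")  # no parent: orphan
except sqlite3.IntegrityError:
    blocked.append("insert")

try:
    conn.execute("DELETE FROM company WHERE CompanyId = 15")  # child still exists
except sqlite3.IntegrityError:
    blocked.append("delete")

print(blocked)  # ['insert', 'delete'] -- both violations are rejected
```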
(d) Explain the following in detail with examples:
(i) Domain
Ans: Definition: The domain of a database attribute is the set of all allowable values that the
attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column.
In data management and database analysis, a data domain refers to all the unique values which
a data element may contain. The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values.[1]
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type and
allowed to have one of two known code values: 'M' for male and 'F' for female, plus NULL for
records where gender is unknown or not applicable (or, arguably, 'U' for unknown as a sentinel
value). The data domain for the gender column is {'M', 'F'}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value (excluding NULL). Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check
constraint or, in more complex cases, in a database trigger. For example, a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero.
This definition combines the concept of domain as an area over which control is exercised with
the mathematical idea of a set of values of an independent variable for which a function is
defined.
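Check constraints of the kind just described can be sketched with Python's sqlite3. The person table is hypothetical; both the gender domain and the positive-value rule are enforced declaratively by the DBMS:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Domain rules as CHECK constraints: gender limited to 'M'/'F', age must be positive
conn.execute("""CREATE TABLE person (
    name TEXT,
    gender TEXT CHECK (gender IN ('M', 'F')),
    age INTEGER CHECK (age > 0))""")

conn.execute("INSERT INTO person VALUES ('Ann', 'F', 30)")  # within both domains

rejected = 0
for stmt in ("INSERT INTO person VALUES ('Bob', 'X', 25)",   # outside gender domain
             "INSERT INTO person VALUES ('Cat', 'M', -5)"):  # non-positive age
    try:
        conn.execute(stmt)
    except sqlite3.IntegrityError:
        rejected += 1  # the check constraint blocks the row
print(rejected)  # 2
```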
(ii) Degree and cardinality
The degree of a relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) to the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
Note that the last is written M:N, not M:M.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-
to-one relationship rarely exists in practice, but it can; if it does, you may consider combining
the two entities into one.
For example, an employee is allocated a company car, which can only be driven by that
employee.
Therefore, there is a one-to-one relationship between employee and company car.
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example,
taking the employee and department entities shown on the previous page, an employee works in
one department, but a department has many employees.
Therefore, there is a one-to-many relationship between department and employee.
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would prevent any such relationships, but the
definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely exist. Normally they occur
because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a
team of many employees.
Therefore, there is a many-to-many relationship between employee and project.
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL), the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct, and self-contained language. The DBTG is intended to meet the
requirements of many distinct programming languages, not just COBOL; the user in a
DBTG system is considered to be an ordinary application programmer, and the language
therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the
Data Definition Language or DDL), the DBTG has proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of the conceptual scheme that
was itself defined using the Data Definition Language. It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view.
Architecture of the DBTG Model
The architecture of a DBTG system is illustrated in Figure.
The architecture of the DBTG model can be divided into three different levels, like the
architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
Schema
In DBTG, the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data-items they
contain, and the sets into which they are grouped. (Here, logical record types are referred
to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data-items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data-item, and set are excluded.
In the DBTG model, the users are application programmers writing in an ordinary
programming language, such as COBOL, that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data-item
type) defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define normalization. Explain first and second normal form.
Ans: Normalization: the process of decomposing unsatisfactory, 'bad' relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties.
Normalization in industry pays particular attention to
normalization up to 3NF, BCNF, or 4NF.
We will pay particular attention up to 3NF.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table:
transform data from the information source (e.g. a form) into table format, with columns
and rows.
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
If a table of data meets the definition of a relation, it is in first normal form:
Every relation has a unique name.
Every attribute value is atomic (single-valued).
Every row is unique.
Attributes in tables have unique names.
The order of the columns is irrelevant.
The order of the rows is irrelevant.
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove each repeating group by
entering appropriate data into the empty columns of rows containing repeating
data ('flattening' the table),
or by
placing the repeating data, along with a copy of the original key attribute(s), into a
separate relation.
Second Normal Form (2NF)
Based on the concept of full functional dependency:
if A and B are attributes of a relation,
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A.
2NF: a relation that is in 1NF and in which every non-primary-key attribute is fully
functionally dependent on the primary key.
Equivalently: 1NF and no partial functional dependencies.
A partial functional dependency exists when one or more non-key attributes are functionally
dependent on part of the primary key.
Every non-key attribute must be defined by the entire key, not just by part of the key.
If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies exist on the primary key, remove them by placing them in a new
relation along with a copy of their determinant.
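The 1NF-to-2NF steps above can be sketched with Python's sqlite3. The OrderLine relation, its key (orderNo, productNo), and the partial dependency productNo → productName are all hypothetical, chosen so that productName depends on only part of the key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical 1NF relation with key (orderNo, productNo); productName depends
# only on productNo: a partial dependency, so the relation is not in 2NF
rows = [(1, "P1", "Bolt", 10), (1, "P2", "Nut", 5), (2, "P1", "Bolt", 7)]
conn.execute("CREATE TABLE OrderLine (orderNo INT, productNo TEXT, "
             "productName TEXT, qty INT, PRIMARY KEY (orderNo, productNo))")
conn.executemany("INSERT INTO OrderLine VALUES (?, ?, ?, ?)", rows)

# 2NF decomposition: move the partial dependency into its own relation,
# together with a copy of its determinant (productNo)
conn.execute("CREATE TABLE Product AS "
             "SELECT DISTINCT productNo, productName FROM OrderLine")
conn.execute("CREATE TABLE OrderQty AS "
             "SELECT orderNo, productNo, qty FROM OrderLine")

# Joining the two new relations reproduces the original data (lossless)
joined = conn.execute(
    "SELECT o.orderNo, o.productNo, p.productName, o.qty "
    "FROM OrderQty o JOIN Product p ON o.productNo = p.productNo "
    "ORDER BY o.orderNo, o.productNo").fetchall()
print(joined == rows)  # True
```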
Third Normal Form (3NF)
2NF and no transitive dependencies.
A transitive dependency is a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency:
if A, B, and C are attributes of a relation such that A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C).
3NF: a relation that is in 1NF and 2NF, and in which no non-primary-key
attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with a suitable example.
Ans:
As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is said to be in
4NF if and only if it is in BCNF and all its multi-valued dependencies are functional
dependencies. 4NF removes unwanted data structures: multi-valued dependencies. For a
relation to be in fourth normal form, either of the following conditions must hold true:
- There is no multivalued dependency in the relation, or
- There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
considers multivalued dependencies.
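A small illustration of why 4NF decomposition works: a relation holding two independent multivalued facts equals the natural join of its two projections, so splitting it loses nothing. The employee data below (skills and languages, i.e. name →→ skill and name →→ language) is invented:

```python
# Hypothetical relation with two independent multivalued facts about an employee
emp = {("Ali", "SQL", "English"), ("Ali", "SQL", "Urdu"),
       ("Ali", "Java", "English"), ("Ali", "Java", "Urdu")}

# 4NF decomposition: project each multivalued fact into its own relation
skills = {(n, s) for n, s, _ in emp}   # (name, skill)
langs = {(n, l) for n, _, l in emp}    # (name, language)

# The natural join of the projections reproduces the original relation,
# so the decomposition is lossless
rejoined = {(n, s, l) for n, s in skills for m, l in langs if n == m}
print(rejoined == emp)  # True
```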
(d) What are inference axioms? Explain their significance in Relational
Database Design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy
certain other FDs.
F1 (Reflexivity): X → X
F2 (Augmentation): If Z ⊆ W and X → Y, then XW → YZ
F3 (Additivity): If X → Y and X → Z, then X → YZ
F4 (Projectivity): If X → YZ, then X → Y
F5 (Transitivity): If X → Y and Y → Z, then X → Z
F6 (Pseudotransitivity): If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity of (2) and (4)
[From Maier]
1. Let R = (A, B, C, D, E, G, H, I, J), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F:
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
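Derivations like the one above can be checked mechanically with the standard attribute-closure algorithm: AB → GH is implied by F if and only if GH ⊆ (AB)+. A minimal Python sketch using the same FD set:

```python
# F = {AB -> E, AG -> J, BE -> I, E -> G, GI -> H}, each FD as (lhs, rhs) sets
fds = [({"A", "B"}, {"E"}), ({"A", "G"}, {"J"}), ({"B", "E"}, {"I"}),
       ({"E"}, {"G"}), ({"G", "I"}, {"H"})]

def closure(attrs, fds):
    """Grow the attribute set by applying FDs until nothing changes."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# (AB)+ contains G and H, confirming that AB -> GH is implied by F
print(sorted(closure({"A", "B"}, fds)))  # ['A', 'B', 'E', 'G', 'H', 'I', 'J']
```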
Significance in Relational Database design: A relational database is a database structure,
commonly used in GIS, in which data is stored in two-dimensional tables and multiple
relationships between data elements can be defined and established in an ad-hoc manner. A
Relational Database Management System (RDBMS) is a database system made up of files with
data elements in a two-dimensional array (rows and columns). Such a system has the capability
to recombine data elements to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables.
• The tables are manipulated a set at a time, rather than a record at a time.
• SQL is used to manipulate relational databases. The model was proposed by Dr. Codd in 1970.
• It is the basis for the relational database management system (RDBMS).
• The relational model contains the following components:
• A collection of objects or relations
• A set of operations to act on the relations
Q5
EITHER
(a) What is deadlock How can it be avoided How can it be
resolved once it occurs Ans A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user It can be avoided in 2 ways 1 is to set measures which
prevent deadlocks from happening and 2 is to set ways in which to break the deadlock
after it happens One way to prevent or to avoid deadlocks is to require the user to request
all necessary locks atone time ensuring they gain access to everything they need or
nothing Secondly sometimes they can be avoided by setting resource access order
meaning resources must be locked in a certain order to prevent such instances Essentially
once a deadlock does occur the DBMS must have a method for detecting the deadlock
and then to resolve it the DBMS must select a transaction to cancel and revert the entire
transaction until the resources required become available allowing one transaction to
complete while the other has to be reprocessed at a later time 921 Explain the meaning
of the expression ACID transaction
ACID stands for Atomicity, Consistency, Isolation, Durability. When any transaction happens, it
should be atomic: it should either complete fully or not at all; there should be nothing like a
semi-complete transaction. The database state should remain consistent after the completion of
the transaction. If there is more than one transaction, the transactions should be scheduled in
such a fashion that they remain in isolation from one another. Durability means that once a
transaction commits, its effects will persist even if there are system failures.

9.24 What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the
process of being changed. Their purpose is to ensure consistency throughout the database. For
example, if I am changing a row that affects the calculations or outputs of several other rows,
then all rows that are affected, or possibly affected, by the change will be locked from changes
until my change is complete. This isolates the change and ensures that data interaction remains
accurate and consistent, and is known as transaction-level consistency. The transaction being
changed, which may affect several other pieces of data or rows of input, can also affect how
those rows are read. Say I am processing a change to the tax rate in my state: my store clerk
should not be able to read the total cost of a blue shirt, because the total-cost row is affected by
any change in the tax-rate row. Essentially, how you deal with the reading and viewing of data
while a change is being processed but has not yet been committed is the transaction isolation
level. Its purpose is to ensure that no one is misinformed prior to a transaction being committed.
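The lock-ordering prevention strategy from part (a) can be sketched with Python's standard threading module. This is an illustrative toy, not a DBMS lock manager; the transaction names and resources are invented for the example:

```python
import threading

# Two shared resources; a global ordering (here, by object id) is fixed in advance.
lock_a = threading.Lock()
lock_b = threading.Lock()

def acquire_in_order(*locks):
    """Acquire locks in one canonical order to rule out circular waits."""
    for lock in sorted(locks, key=id):
        lock.acquire()

def release_all(*locks):
    for lock in locks:
        lock.release()

results = []

def transaction(name):
    # Both transactions touch both resources, but always lock them in the
    # same global order, so neither can hold one lock while waiting for
    # the other in a cycle.
    acquire_in_order(lock_a, lock_b)
    try:
        results.append(name)
    finally:
        release_all(lock_a, lock_b)

t1 = threading.Thread(target=transaction, args=("T1",))
t2 = threading.Thread(target=transaction, args=("T2",))
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(results))  # ['T1', 'T2'] — both transactions complete
```

Because both transactions always request the locks in the same global order, the circular wait needed for a deadlock can never form.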
(b) Explain concurrency control and database recovery in detail
Ans In a multiprogramming environment where multiple transactions can be executed
simultaneously it is highly important to control the concurrency of transactions We have
concurrency control protocols to ensure atomicity isolation and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
• Lock-based protocols
• Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it Locks are of two
kinds:
• Binary locks - A lock on a data item can be in two states: it is either locked or unlocked.
• Shared/exclusive - This type of locking mechanism differentiates the locks based on their
use. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock;
allowing more than one transaction to write to the same data item would lead the database into
an inconsistent state. Read locks are shared, because no data value is being changed.
There are four types of lock protocol available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock a data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock.
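The growing and shrinking phases can be sketched in a few lines of Python (a minimal illustration, not a real lock manager; the class name is invented):

```python
# Minimal sketch of two-phase locking for one transaction: locks may be
# acquired only while the transaction is growing; the first release
# switches it to shrinking, after which no new lock may be requested.

class Transaction2PL:
    def __init__(self):
        self.held = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after first unlock")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True  # growing phase ends at the first release
        self.held.discard(item)

t = Transaction2PL()
t.lock("A")
t.lock("B")
t.unlock("A")        # shrinking phase begins here
try:
    t.lock("C")      # illegal under 2PL
    violated = False
except RuntimeError:
    violated = True
print(violated)  # True
```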
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first
phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not
release a lock after using it: it holds all the locks until the commit point and releases them
all at once. Strict-2PL therefore does not suffer from the cascading aborts that 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either the system time or a logical counter as the timestamp.
Lock-based protocols manage the order between conflicting pairs of transactions at execution
time, whereas timestamp-based protocols start working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 00:02 is older than all transactions that
come after it; for example, a transaction y entering the system at 00:04 is two seconds younger,
and priority is given to the older one.
In addition, every data item carries the latest read-timestamp and write-timestamp. These let the
system know when the last read and write operations were performed on the data item.
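A minimal sketch of these timestamp checks in Python (illustrative only; a real scheduler would also restart rolled-back transactions and may apply refinements such as Thomas's write rule):

```python
# Each data item tracks the newest read- and write-timestamps; an
# operation arriving from an "older" transaction after a younger one
# has already touched the item is rejected (the transaction would be
# rolled back and restarted with a new timestamp).

class DataItem:
    def __init__(self):
        self.read_ts = 0   # timestamp of the youngest reader so far
        self.write_ts = 0  # timestamp of the youngest writer so far
        self.value = None

def read(item, ts):
    if ts < item.write_ts:             # a younger txn already wrote: too late
        return "rollback"
    item.read_ts = max(item.read_ts, ts)
    return item.value

def write(item, ts, value):
    if ts < item.read_ts or ts < item.write_ts:
        return "rollback"              # a younger txn already read or wrote
    item.write_ts = ts
    item.value = value
    return "ok"

x = DataItem()
print(write(x, ts=2, value=10))   # ok
print(read(x, ts=3))              # 10 — a younger reader is fine
print(write(x, ts=1, value=99))   # rollback — older writer arrives too late
```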
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator
and/or other information security professional.
Some of the ways database security is analyzed and implemented include
• Restricting unauthorized access and use by implementing strong and multifactor access
and data-management controls
• Load/stress testing and capacity testing of a database to ensure it does not crash under a
distributed denial-of-service (DDoS) attack or user overload
• Physical security of the database server and backup equipment against theft and natural
disasters
• Reviewing the existing system for known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d) Explain knowledge-based database systems in detail.
Ans
The term knowledge base was coined to distinguish this form of knowledge store from the
more common and widely used term database. At the time (the 1970s), virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database. At this point in the history of Information Technology, the distinction
between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
• Flat data - Data was usually represented in a tabular format, with strings or numbers in each
field.
• Multiple users - A conventional database needed to support more than one user or system
logged into the same data at the same time.
• Transactions - An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-called ACID
properties: Atomicity, Consistency, Isolation and Durability.
• Large, long-lived data - A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal
representation for a knowledge base is an object model (often called an ontology in the
artificial intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was no critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and that they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze and reuse
knowledge; for example, see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge base compared to a conventional
database. The knowledge base needed to know facts about the world, for example to represent
the statement that "all humans are mortal". A database typically could not represent this general
knowledge, but would instead need to store information about thousands of specific humans.
Representing that all humans are mortal, and being able to reason about any given human that
they are mortal, is the work of a knowledge base; representing that George, Mary, Sam, Jenna,
Mike and hundreds of thousands of other customers are all humans with specific ages, sex,
address, etc. is the work of a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments,
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple, distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and object-oriented
communities, object-oriented databases such as Versant emerged: systems designed from the
ground up to support object-oriented capabilities, but also to support standard
database services as well. On the other hand, large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements, such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution of the term knowledge base was driven by the Internet. With the rise of the
Internet, documents, hypertext and multimedia support became critical for any corporate
database. It was no longer enough to support large tables of data or relatively small objects that
lived primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents. This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge management
vendors, such as Lotus Notes. Knowledge management actually predated the Internet, but with
the Internet there was great synergy between the two areas. Knowledge management products
adopted the term knowledge base to describe their repositories, but the meaning had a subtle
difference. In the case of previous knowledge-based systems, the knowledge was primarily for
the use of an automated system, to reason about and draw conclusions about the world. With
knowledge management products, the knowledge was primarily meant for humans, for example
to serve as a repository of manuals, procedures,
policies, best practices, reusable designs and code, etc. In both cases the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system performing automated reasoning, or knowledge-based in the sense of knowledge
management providing knowledge in the form of documents and media that could be leveraged
by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018 – 2019
Subject: DBMS
MCA 1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations
without them conflicting with one another. Concurrent access is quite easy if all users are just
reading data, as there is no way they can interfere with one another. However, any practical
database has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user
system. It helps you make sure that database transactions are performed concurrently without
violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data are
executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring
only partially, which can cause greater problems than rejecting the whole series outright. As a
consequence, the transaction cannot be observed to be in progress by another database client: at
one moment in time it has not yet happened, and at the next it has already occurred in whole (or
nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and depositing it into
account B. Performing these operations in an atomic transaction ensures that the database
remains in a consistent state, that is, money is neither lost nor created if either of those two
operations fails.
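The transfer example can be sketched with Python's built-in sqlite3 module, whose connection context manager commits on success and rolls back on error. The schema and account names are invented for the example:

```python
import sqlite3

# Hypothetical two-account schema; a transfer either fully commits
# or is fully rolled back, so money is neither lost nor created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    try:
        with conn:  # one atomic transaction: commit on success, rollback on error
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            cur = conn.execute("SELECT balance FROM account WHERE name = ?", (src,))
            if cur.fetchone()[0] < 0:
                raise ValueError("insufficient funds")  # triggers rollback
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except ValueError:
        return False

print(transfer(conn, "A", "B", 30))    # True  — A=70, B=80
print(transfer(conn, "A", "B", 1000))  # False — rolled back, balances unchanged
total = conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]
print(total)  # 150 — the total is preserved either way
```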
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
• All users should be able to access the same data.
• A user's view is immune to changes made in other views.
• Users should not need to know physical database storage details.
• The DBA should be able to change database storage structures without affecting the users' views.
• The internal structure of the database should be unaffected by changes to physical aspects of storage.
• The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database; this view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part
that is relevant to the user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer uses
either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
• Data Definition Language (DDL)
• Data Manipulation Language (DML)
• Data Control Language (DCL)
The data definition language defines and declares database objects, while the data manipulation
language performs operations on these objects. The data control language is used to control the
user's access to database objects.
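As a small illustration of these three sub-languages, here is a sketch using Python's built-in sqlite3 module (SQLite has no user accounts, so the DCL statement appears only as a comment of what it would look like in a server DBMS):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and declare a database object.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on the object.
conn.execute("INSERT INTO student (id, name) VALUES (1, 'Asha')")
rows = conn.execute("SELECT name FROM student").fetchall()
print(rows)  # [('Asha',)]

# DCL: controls user access. SQLite has no users, but in a server DBMS
# it would look like:  GRANT SELECT ON student TO clerk;
```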
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for DBMS are suitably explained
above.
(C) Describe the structure of DBMS
Ans: A DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update and
retrieval) on the database. The components of the DBMS perform these requested operations on
the database and provide the necessary data to the users.
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Definition Language compiler processes schema definitions
specified in the DDL and records metadata such as the names of the files, the data items, the
storage details of each file, mapping information, and constraints.
2. DML Compiler and Query Optimizer - DML commands such as insert, update, delete and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized by the query optimizer to find the
best way to execute the query, and sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also
known as the Database Control System.
The main functions of the Data Manager are:
• Convert operations in users' queries, coming from the application programs or from the
combination of DML compiler and query optimizer (known as the Query Processor), from the
user's logical view to the physical file system.
• Control DBMS information access that is stored on disk.
• Handle buffers in main memory.
• Enforce constraints to maintain the consistency and integrity of the data.
• Synchronize the simultaneous operations performed by concurrent users.
• Control the backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the
database. It contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes,
and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data
definitions are changed.
3. Constraints on data, i.e., the range of values permitted.
4. Detailed information on physical database design, such as storage structures, access
paths, and file and record sizes.
5. Access authorization - descriptions of database users, their responsibilities and their
access rights.
6. Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control data integrity, database operation and accuracy,
and may be used as an important part of the DBMS.
Importance of the Data Dictionary - the data dictionary is necessary in databases for the
following reasons:
• It improves the DBA's control over the information system and the users' understanding
of the system.
• It helps in documenting the database design process by storing documentation of the
results of every design phase and of design decisions.
5 Data Files - It contains the data portion of the database
6. Compiled DML - The DML compiler converts high-level queries into low-level file-access
commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve users - Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction, and he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the ATM user, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of responses is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users - These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application programmers - Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database administrator - Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different
application systems. This stresses the importance of multiple applications sharing data: the
database becomes a common resource for an agency. It implies separation of physical storage
from use of the data by an application program, i.e., program/data independence. The user,
programmer or application specialist need not know the details of how the data are stored; such
details are transparent to the user. Changes can be made to the data without affecting other
components of the system, e.g., changing the format of data items (real to integer arithmetic),
changing the file structure (reorganizing data internally or changing the mode of access), or
relocating data from one device to another (e.g., from optical to magnetic storage, or from tape
to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system, every user group
maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files.
• Wastage of storage space, since duplicated data is stored.
• Errors generated when the same data is updated in different files.
• Time wasted in entering the same data again and again.
• Computer resources used needlessly.
• Difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to another file.
This may lead to inconsistent data, so this duplication of data across multiple files must be
removed to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In
conventional systems, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it
easy to respond to unanticipated information requests.
Centralizing the data in a database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, use of a DBMS should
allow users who do not know programming to interact with the data more easily, unlike a file
processing system, where the programmer may need to write new programs to meet every new
demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the data in the
database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce integrity
constraints.
In conventional systems, because data is duplicated in multiple files, updates may sometimes
leave incorrect data in some of the files where the data exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the
purposes of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an
ad-hoc/temporary manner. Often different systems of an organization access different
components of the operational data, and in such an environment enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to which parts of the database, and
different checks can be established for each type of access (retrieve, modify, delete, etc.) to
each piece of information in the database.
8. Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work, and therefore its own needs,
as the most important. Once a database has been set up with centralized control, it becomes
necessary to identify the organization's requirements and to balance the needs of the competing
units. It may become necessary to ignore some requests for information if they conflict with
higher-priority needs of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large, the
overall cost of setting up the database and developing and maintaining application programs is
normally expected to be far lower than for a similar service using conventional systems, since
the productivity of programmers can be higher using the non-procedural languages that have
been developed alongside DBMSs than using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand, and the overall view is often not considered. Building an overall view of
an organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database makes it possible to provide
schemes for backup and recovery from failures, including disk crashes, power failures and
software errors, which may help the database recover from an inconsistent state to the state that
existed prior to the failure, though the methods involved are very complex.
QUE2- EITHER
(A) Explain the ER model with a suitable example.
Ans: The E-R model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process, with all business managers (or designates) involved, and
should be validated with a "bottom-up" approach. It has three primary components: entities,
relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity - An entity may be defined as a thing which is recognized as being capable of an
independent existence and which can be uniquely identified. An entity is an abstraction from the
complexities of some domain. When we speak of an entity, we normally speak of some aspect of
the real world which can be distinguished from other aspects of the real world. An entity may be
a physical object such as a house or a car, an event such as a house sale or a car service, or a
concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type,
and there are usually many instances of an entity-type. Because the term entity-type is somewhat
cumbersome, most people tend to use the term entity as a synonym.
Attributes - An attribute is a characteristic of an entity. A Student entity's attributes are, for
example, student ID, student name, address, etc.
Attributes are of various types:
• Simple/single attributes
• Composite attributes
• Multivalued attributes
• Derived attributes
Relationship - A relationship captures how two or more entities are related to one another.
Relationships can be thought of as verbs linking two or more nouns. Examples: an owns
relationship between a company and a computer, a supervises relationship between an employee
and a department, a performs relationship between an artist and a song, a proved relationship
between a mathematician and a theorem. Relationships are represented as diamonds connected
by lines to each of the entities in the relationship. The types of relationships are as follows:
• One to many: 1 ------- M
• Many to one: M ------- 1
• Many to many: M ------- M
Symbols and their meanings:
• Rectangles represent entity sets.
• Diamonds represent relationship sets.
• Lines link attributes to entity sets and entity sets to relationship sets.
• Ellipses represent attributes.
• Double ellipses represent multivalued attributes.
• Dashed ellipses denote derived attributes.
• An underline indicates primary-key attributes.
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), where street is itself composite (street_name, street_number, apartment_number).
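The Customer entity above can be mapped to relational tables: composite attributes (name, address) are flattened into simple columns, and the multivalued phone_number attribute moves into its own table. A minimal sketch using Python's built-in sqlite3 module (the column layout shown is one possible mapping, not prescribed by the question paper):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (                 -- composite attributes flattened
    customer_id   INTEGER PRIMARY KEY,
    first_name    TEXT, middle_name TEXT, last_name TEXT,
    date_of_birth TEXT,
    city TEXT, state TEXT, zip_code TEXT,
    street_name TEXT, street_number TEXT, apartment_number TEXT
);
CREATE TABLE customer_phone (           -- multivalued attribute: one row per number
    customer_id  INTEGER REFERENCES customer(customer_id),
    phone_number TEXT,
    PRIMARY KEY (customer_id, phone_number)
);
""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name) VALUES (1, 'Asha', 'Rao')")
conn.execute("INSERT INTO customer_phone VALUES (1, '555-0101')")
conn.execute("INSERT INTO customer_phone VALUES (1, '555-0102')")
phones = conn.execute("SELECT COUNT(*) FROM customer_phone WHERE customer_id = 1").fetchone()[0]
print(phones)   # one customer, two phone numbers
```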
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index sequential files, and direct files, we have considered the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
Q3 EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
Q4 EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot be further decomposed losslessly into any number of smaller tables.
Another way of expressing this is that each join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item, and Vendor-Item.
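For the Buying example, the decomposition into Buyer-Vendor, Buyer-Item, and Vendor-Item can be checked mechanically: joining the three projections back on their common columns reproduces exactly the five original rows (the join is lossless for this sample data). A sketch with Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE buying (buyer TEXT, vendor TEXT, item TEXT);
INSERT INTO buying VALUES
 ('Sally','Liz Claiborne','Blouses'),
 ('Mary','Liz Claiborne','Blouses'),
 ('Sally','Jordach','Jeans'),
 ('Mary','Jordach','Jeans'),
 ('Sally','Jordach','Sneakers');
-- the three projections of the 5NF decomposition
CREATE TABLE buyer_vendor AS SELECT DISTINCT buyer, vendor FROM buying;
CREATE TABLE buyer_item   AS SELECT DISTINCT buyer, item   FROM buying;
CREATE TABLE vendor_item  AS SELECT DISTINCT vendor, item  FROM buying;
""")
# Joining the projections back on their common columns:
rejoined = conn.execute("""
 SELECT DISTINCT bv.buyer, bv.vendor, vi.item
 FROM buyer_vendor bv
 JOIN vendor_item vi ON bv.vendor = vi.vendor
 JOIN buyer_item  bi ON bi.buyer = bv.buyer AND bi.item = vi.item
""").fetchall()
print(len(rejoined))   # same five rows as the original buying table
```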
(B) Explain the architecture of an IMS System
Ans Information Management system (IMS) is an IBM program product that is designed to support
both batch and online application programs
[Figure: IMS system architecture. Application programs A and B, each written in a host language plus DL/I calls, access the database through their own PSBs (PSB-A, PSB-B), each consisting of PCBs; the IMS control program maps these, via the DBDs, onto the stored physical databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD). The mapping of the physical database to storage is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE#,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called a program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are supported via user-written online application programs. IMS does not provide an integrated query language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
Main characteristics of functional dependencies used in normalization:
They have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency.
They hold for all time.
They are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
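One standard way to test whether a dependency in Y is implied by the dependencies in X is to compute an attribute closure. A short illustrative Python sketch (the helper name `closure` and the sample relation are my own, not from the paper):

```python
def closure(attrs, fds):
    """Compute the closure of a set of attributes under functional dependencies.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    X -> Y is implied by fds iff Y is a subset of closure(X, fds).
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left-hand side is already derivable, add the right-hand side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Example: R(A, B, C) with A -> B and B -> C.
fds = [({'A'}, {'B'}), ({'B'}, {'C'})]
print(sorted(closure({'A'}, fds)))   # A -> C is implied transitively
```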
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal-form condition that the relation meets, and indicates the degree to which it has been normalized. Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF, or 4NF; we will pay particular attention up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, each corresponding to a specific normal form with known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF removes unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
There is no multivalued dependency in the relation; or
There are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers multivalued dependencies.
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found using a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could retrieve a user's account information and efficiently provide extensive information such as transactions and account entries.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000. It has a built-in feature known as the database recovery model that controls the following:
Both the speed and size of your transaction log backups.
The degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery models available
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after a media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
CREATE INDEX operations are logged, so recovery from a transaction log backup that includes index creations is faster, because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans
(b) Differentiate between the network and hierarchical data models in DBMS.
Ans: Hierarchical model
1. One-to-many or one-to-one relationships.
2. Based on parent-child relationships.
3. Retrieval algorithms are complex and asymmetric.
4. Data redundancy: more.
Network model
1. Many-to-many relationships.
2. Many parents as well as many children.
3. Retrieval algorithms are complex and symmetric.
4. Data redundancy: less.
Relational model
1. One-to-one, one-to-many, and many-to-many relationships.
2. Based on relational data structures.
3. Retrieval algorithms are simple and symmetric.
4. Data redundancy: less.
OR
(c) Draw an E-R diagram for a Library Management System.
Ans
(d) State advantages and disadvantages of following file organizations
(i) Index-Sequential file
Ans
Sequential File Organization
1 A sequential file is designed for efficient processing of records in sorted order on some
search key
o Records are chained together by pointers to permit fast retrieval in search key
order
o Pointer points to next record in order
o Records are stored physically in search key order (or as close to this as possible)
o This minimizes number of block accesses
o Figure 10.15 shows an example with bname as the search key
2 It is difficult to maintain physical sequential order as records are inserted and deleted
o Deletion can be managed with the pointer chains
o Insertion poses problems if no space where new record should go
o If space use it else put new record in an overflow block
o Adjust pointers accordingly
o Figure 10.16 shows the previous example after an insertion
o Problem we now have some records out of physical sequential order
o If very few records in overflow blocks this will work well
o If order is lost reorganize the file
o Reorganizations are expensive and done when system load is low
3 If insertions rarely occur we could keep the file in physically sorted order and reorganize
when insertion occurs In this case the pointer fields are no longer required
The Sequential File
Fixed format used for records
Records are the same length
All fields the same (order and length)
Field names and lengths are attributes of the file
One field is the key field; it uniquely identifies the record
Records are stored in key sequence
New records are placed in a log file or transaction file, and a batch update is performed periodically to merge the log file with the master file.
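The batch-update step can be sketched as a single sequential merge pass over the two files, both sorted on the key field. An illustrative Python fragment (the sample records are invented):

```python
import heapq

# Both files are kept sorted on the key field, so the new master file is
# produced by a single sequential merge pass, O(n + m) record reads.
master = [(101, 'Ames'), (204, 'Breen'), (350, 'Chen')]   # sorted on key
log    = [(150, 'Diaz'), (400, 'Evans')]                  # sorted on key

new_master = list(heapq.merge(master, log))               # one merge pass
print([k for k, _ in new_master])   # keys remain in sequence
```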
(ii) Direct file
Direct Access File System (DAFS) is a network file system similar to Network File System
(NFS) and Common Internet File System (CIFS) that allows applications to transfer data while
bypassing operating system control buffering and network protocol operations that can
bottleneck throughput DAFS uses the Virtual Interface (VI) architecture as its underlying
transport mechanism Using VI hardware an application transfers data to and from application
buffers without using the operating system which frees up the processor and operating system
for other processes and allows files to be accessed by servers using several different operating
systems DAFS is designed and optimized for clustered shared-file network environments that
are commonly used for Internet e-commerce and database applications DAFS is optimized for
high-bandwidth InfiniBand networks and it works with any interconnection that supports VI
including Fibre Channel and Ethernet
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and promote DAFS. Today, more than 85 companies are part of the DAFS Collaborative.
Q3
EITHER
(a) Explain tuple relational calculus
Ans
Relational Calculus
A relational calculus query specifies what is to be retrieved rather than how to retrieve it; there is no description of how to evaluate the query.
In first-order logic (or predicate calculus), a predicate is a truth-valued function with arguments. When we substitute values for the arguments, the function yields an expression, called a proposition, which can be either true or false.
If a predicate contains a variable (e.g., 'x is a member of staff'), there must be a range for x. When we substitute some values of this range for x, the proposition may be true; for other values it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
Tuple Relational Calculus
We are interested in finding tuples for which a predicate is true, based on the use of tuple variables.
A tuple variable is a variable that 'ranges over' a named relation, i.e., a variable whose only permitted values are tuples of the relation.
We specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true, we write:
{S | P(S)}
Tuple Relational Calculus - Example
To find details of all staff earning more than $10,000:
{S | Staff(S) ∧ S.salary > 10000}
To retrieve a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Two quantifiers can be used to tell how many instances the predicate applies to:
Existential quantifier ∃ ('there exists')
Universal quantifier ∀ ('for all')
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called free variables.
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧ (B.branchNo = S.branchNo) ∧ B.city = 'London')
This means: 'There exists a Branch tuple with the same branchNo as the branchNo of the current Staff tuple S, and located in London.'
The universal quantifier is used in statements about every instance, such as:
(∀B)(B.city ≠ 'Paris')
This means: 'For all Branch tuples, the address is not in Paris.'
We can also write ~(∃B)(B.city = 'Paris'), which means: 'There are no branches with an address in Paris.'
Formulae should be unambiguous and make sense.
A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
We can recursively build up formulae from atoms:
An atom is a formula.
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction F1 ∨ F2, and negation ~F1.
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also formulae.
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000.
{S.fName, S.lName | Staff(S) ∧ S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow.
{S | Staff(S) ∧ (∃P)(PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
To avoid this, we add the restriction that all values in the result must be values in the domain of the expression.
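A tuple relational calculus expression such as {S | Staff(S) ∧ S.salary > 10000} corresponds directly to a declarative SQL query. A runnable sketch using Python's sqlite3 module (the Staff rows here are invented sample data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staff (staffNo TEXT, fName TEXT, lName TEXT, position TEXT, salary INTEGER);
INSERT INTO Staff VALUES
 ('SL21','John','White','Manager',30000),
 ('SG37','Ann','Beech','Assistant',12000),
 ('SG14','David','Ford','Supervisor',18000),
 ('SA9','Mary','Howe','Assistant',9000);
""")
# {S | Staff(S) AND S.salary > 10000}  ==  SELECT ... FROM Staff WHERE salary > 10000
rows = conn.execute("SELECT staffNo FROM Staff WHERE salary > 10000").fetchall()
print(sorted(r[0] for r in rows))   # everyone except the $9,000 earner
```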
Data Manipulations in SQL
Select Update Delete Insert Statement
Basic Data retrieval
Condition Specification
Arithmetic and Aggregate operators
SQL Join Multiple Table Queries
Set Manipulation
Any, In, Contains, All, Not In, Not Contains, Exists, Union, Minus, Intersect
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement
Data must be entered later using INSERT
CREATE TABLE S ( SNO    CHAR(5),
                 SNAME  CHAR(20),
                 STATUS DECIMAL(3),
                 CITY   CHAR(15),
                 PRIMARY KEY (SNO) )
Creating Tables
A table name and unique column names must be specified
Columns which are defined as primary keys will never have two rows with the same key
value
A primary key may consist of more than one column (values unique in combination); this is called a composite key.
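A composite key can be demonstrated with a hypothetical shipment table SP (a companion to S above; SP is my own example, not defined in the paper). Rows sharing one key column are allowed, but a duplicate combination is rejected:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE SP (                 -- hypothetical shipment table
    SNO CHAR(5),
    PNO CHAR(6),
    QTY DECIMAL(9),
    PRIMARY KEY (SNO, PNO)        -- composite key: unique in combination
)""")
conn.execute("INSERT INTO SP VALUES ('S1', 'P1', 300)")
conn.execute("INSERT INTO SP VALUES ('S1', 'P2', 200)")      # same SNO, new PNO: allowed
try:
    conn.execute("INSERT INTO SP VALUES ('S1', 'P1', 400)")  # duplicate (SNO, PNO)
    result = "accepted"
except sqlite3.IntegrityError:
    result = "rejected"
print(result)
```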
(b) Explain Data Manipulation in SQL
Ans
A data manipulation language (DML) is a computer programming language used for adding
(inserting) deleting and modifying (updating) data in a database A DML is often
a sublanguage of a broader database language such as SQL with the DML comprising some of
the operators in the language[1] Read-only selecting of data is sometimes distinguished as being
part of a separate data query language (DQL) but it is closely related and sometimes also
considered a component of a DML some operators may perform both selecting (reading) and
writing
A popular data manipulation language is that of Structured Query Language (SQL), which is used to retrieve and manipulate data in a relational database.[2] Other forms of DML are those used by IMS/DL/I and CODASYL databases such as IDMS, among others.
In SQL the data manipulation language comprises the SQL-data change statements[3] which
modify stored data but not the schema or database objects Manipulation of persistent database
objects eg tables or stored procedures via the SQL schema statements[3] rather than the data
stored within them is considered to be part of a separate data definition language (DDL) In SQL
these two categories are similar in their detailed syntax data types expressions etc but distinct
in their overall function[3]
The SQL-data change statements are a subset of the SQL-data statements; the latter also contains the SELECT query statement,[3] which strictly speaking is part of the DQL, not the DML. In common practice, though, this distinction is not made, and SELECT is widely considered to be part of DML,[4] so the DML consists of all SQL-data statements, not only the SQL-data change statements. The SELECT ... INTO form combines both selection and manipulation, and thus is strictly considered to be DML, because it manipulates (i.e., modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a
statement which is almost always a verb In the case of SQL these verbs are
SELECT ... FROM ... WHERE (strictly speaking, DQL)
SELECT ... INTO ...
INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE
DELETE FROM ... WHERE
For example, the command to insert a row into table employees:
INSERT INTO employees (first_name, last_name, fname) VALUES ('John', 'Capita', 'xcapit00');
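The four DML verbs can be exercised end-to-end with Python's sqlite3 module. A sketch reusing the employees table from the INSERT example above (column types are my own assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")
# INSERT INTO ... VALUES
conn.execute("INSERT INTO employees VALUES ('John', 'Capita', 'xcapit00')")
# UPDATE ... SET ... WHERE
conn.execute("UPDATE employees SET last_name = 'Kapital' WHERE fname = 'xcapit00'")
# SELECT ... FROM ... WHERE
row = conn.execute("SELECT last_name FROM employees WHERE fname = 'xcapit00'").fetchone()
print(row[0])
# DELETE FROM ... WHERE
conn.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
remaining = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(remaining)
```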
OR
(c) Explain following integrity rules
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce these rules automatically, but it is safer to make sure that the rules are already applied in the design. There are two types of integrity mentioned in integrity rules: entity and referential. Two additional rules that aren't necessarily included in integrity rules, but are pertinent to database designs, are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that is unique; this ensures that each row is uniquely identified by the primary key. One requirement for entity integrity is that a primary key cannot have a null value. The purpose of this integrity is for each row to have a unique identity, so that foreign key values can properly reference primary key values.
Theta Join
In a theta join we apply a condition on the input relation(s), and only the selected rows are then used in the cross product to be merged and included in the output. In a normal cross product, all the rows of one relation are mapped/merged with all the rows of the second relation; here, only selected rows of a relation go into the cross product with the second relation.
If R and S are two relations, then θ is the condition applied in the select operation on one relation, and only the selected rows are then cross-producted with all the rows of the second relation. For example, given the two relations FACULTY and COURSE, we first apply the select operation on the FACULTY relation to select certain specific rows; these rows then form a cross product with the COURSE relation. This is the difference between a cross product and a theta join. Looking first at both relations and their attributes, and then at the cross product after carrying out the select operation, the difference between cross product and theta join becomes clear.
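In SQL, a theta join is a join whose ON condition may use any comparison operator, not just equality. A sketch with invented FACULTY and COURSE data (column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FACULTY (fac_id INTEGER, fac_name TEXT, max_load INTEGER);
CREATE TABLE COURSE  (course_id TEXT, credits INTEGER);
INSERT INTO FACULTY VALUES (1, 'Khan', 3), (2, 'Iyer', 6);
INSERT INTO COURSE  VALUES ('CS101', 4), ('CS102', 5);
""")
# Theta join: the cross product FACULTY x COURSE, keeping only rows
# where the theta condition max_load > credits holds.
rows = conn.execute("""
 SELECT fac_name, course_id
 FROM FACULTY JOIN COURSE ON max_load > credits
""").fetchall()
print(sorted(rows))   # only Iyer's load exceeds either course's credits
```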
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship
In relationships data is linked between two or more tables This is achieved by having
the foreign key (in the associated table) reference a primary key value (in the primary ndash or
parent ndash table) Because of this we need to ensure that data on both sides of the relationship
remain intact
So referential integrity requires that whenever a foreign key value is used it must reference a
valid existing primary key in the parent table
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there is no foreign key in any related table with the value of 15. We should only be able to delete a primary key if there are no associated records; otherwise we would end up with an orphaned record: a related table containing a foreign key value that doesn't exist in the primary key field of the primary table (e.g., the "CompanyId" field).
So referential integrity will prevent users from
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
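These protections can be seen in action once foreign-key enforcement is switched on. A sqlite3 sketch reusing the CompanyId example (table and column names are illustrative; note that SQLite enforces foreign keys only after the PRAGMA below):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite needs this enabled explicitly
conn.executescript("""
CREATE TABLE company (CompanyId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product (
    ProductId INTEGER PRIMARY KEY,
    CompanyId INTEGER REFERENCES company(CompanyId)
);
""")
conn.execute("INSERT INTO company VALUES (15, 'Acme')")
conn.execute("INSERT INTO product VALUES (1, 15)")

outcomes = []
try:
    conn.execute("DELETE FROM company WHERE CompanyId = 15")   # would orphan product 1
except sqlite3.IntegrityError:
    outcomes.append("delete rejected")
try:
    conn.execute("INSERT INTO product VALUES (2, 99)")         # no parent with key 99
except sqlite3.IntegrityError:
    outcomes.append("insert rejected")
print(outcomes)
```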
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually with no indication of an error. This could result in records being "lost" in the database, because they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated company).
Or, worse yet, it could result in customers not receiving products they paid for.
Worse still, it could affect life-and-death situations, such as a hospital patient not receiving the correct treatment, or a disaster relief team not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity which is concerned with the accuracy and
consistency of all data (relationship or otherwise) Maintaining data integrity is a crucial part of
working with databases
(d) Explain the following in detail with examples.
(i) Domain
Ans: Definition: The domain of a database attribute is the set of all allowable values that attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are the only permitted entries in that column.
In data management and database analysis, a data domain refers to all the unique values which a data element may contain. The rule for determining the domain boundary may be as simple as a data type with an enumerated list of values.[1]
For example, a database table that has information about people, with one record per person, might have a gender column. This gender column might be declared as a string data type and allowed to have one of two known code values (M for male, F for female), plus NULL for records where gender is unknown or not applicable (or, arguably, U for unknown as a sentinel value). The data domain for the gender column is {M, F}.
In a normalized data model, the reference domain is typically specified in a reference table. Following the previous example, a Gender reference table would have exactly two records, one per allowed value (excluding NULL). Reference tables are formally related to other tables in a database by the use of foreign keys.
Less simple domain boundary rules if database-enforced may be implemented through a check
constraint or in more complex cases in a database trigger For example a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero
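Such a check constraint might look as follows in a sqlite3 sketch (the table name and columns are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE person (
    name   TEXT,
    gender TEXT CHECK (gender IN ('M', 'F')),   -- enumerated domain
    salary NUMERIC CHECK (salary > 0)           -- values must be greater than zero
)""")
conn.execute("INSERT INTO person VALUES ('Ann', 'F', 52000)")   # within the domain
try:
    conn.execute("INSERT INTO person VALUES ('Bob', 'X', 40000)")  # outside the domain
    result = "accepted"
except sqlite3.IntegrityError:
    result = "rejected"
print(result)
```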
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
(ii) Degree and cardinality
The degree of relationship (also known as cardinality) is the number of occurrences in one entity which are associated (or linked) with the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
Note that the last one is written M:N, not M:M.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-to-one relationship rarely exists in practice, but it can; if it does, you may consider combining the two entities into one.
For example, an employee is allocated a company car which can only be driven by that employee.
Therefore there is a one-to-one relationship between employee and company car.
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example, taking the employee and department entities, an employee works in one department but a department has many employees.
Therefore there is a one-to-many relationship between department and employee.
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would prevent any such relationships, but the definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely survive in a final design; normally they occur because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a team of many employees.
Therefore there is a many-to-many relationship between employee and project.
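In a relational schema, an M:N relationship such as employee-project is resolved with a junction table, turning it into two 1:M relationships. An illustrative sqlite3 sketch (table and column names are my own):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);
-- junction table resolving the M:N relationship into two 1:M ones
CREATE TABLE works_on (
    emp_id  INTEGER REFERENCES employee(emp_id),
    proj_id INTEGER REFERENCES project(proj_id),
    PRIMARY KEY (emp_id, proj_id)
);
INSERT INTO employee VALUES (1, 'Lena'), (2, 'Omar');
INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Intranet');
INSERT INTO works_on VALUES (1, 10), (1, 20), (2, 10);   -- Lena is on two projects
""")
n = conn.execute("SELECT COUNT(*) FROM works_on WHERE emp_id = 1").fetchone()[0]
print(n)   # number of projects employee 1 works on
```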
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL) the group responsible for standardization of the
programming language COBOL The DBTG final report appeared in April 1971; it
introduced a new distinct and self-contained language The DBTG is intended to meet the
requirements of many distinct programming languages not just COBOL the user in a
DBTG system is considered to be an ordinary application programmer and the language
therefore is not biased toward any single specific programming language
(b) It is based on network model In addition to proposing a formal notation for networks (the
Data Definition Language or DDL) the DBTG has proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of conceptual scheme that
was itself defined using the Data Definition Language It also proposed a Data
Manipulation Language (DML) suitable for writing applications programs that
manipulate the conceptual scheme or a view
(c) Architecture of DBTG Model
(d) The architecture of a DBTG system is illustrated in Figure
(e) The architecture of DBTG model can be divided in three different levels as the
architecture of a database system These are
(f) • Storage Schema (corresponds to Internal View of database)
(g) • Schema (corresponds to Conceptual View of database)
(h) • Subschema (corresponds to External View of database)
(i) Storage Schema
(j) The storage structure (Internal View) of the database is described by the storage schema
written in a Data Storage Description Language (DSDL)
(k) Schema
(l) In DBTG the Conceptual View is defined by the schema The schema consists
essentially of definitions of the various type of record in the database the data-items they
contain and the sets into which they are grouped (Here logical record types are referred
to as record types; the fields in a logical record format are called data items)
(m) Subschema
(n) The External view (not a DBTG term) is defined by a subschema A subschema consists
essentially of a specification of which schema record types the user is interested in which
schema data-items he or she wishes to see in those records and which schema
relationships (sets) linking those records he or she wishes to consider By default all
other types of record data-item and set are excluded
(o) In DBTG model the users are application programmers writing in an ordinary
programming language such as COBOL that has been extended to include the DBTG
data manipulation language Each application program invokes the corresponding
subschema using the COBOL Data Base Facility for example the programmer simply
specifies the name of the required subschema in the Data Division of the program This
invocation provides the definition of the user work area (UWA) for that program The
UWA contains a distinct location for each type of record (and hence for each type of data-
item) defined in the subschema The program may refer to these data-item and record
locations by the names defined in the subschema
Q5
EITHER
(a) Define Normalization Explain first and second normal form
Ans Normalization The process of decomposing unsatisfactory (bad) relations by
breaking up their attributes into smaller relations
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties
Normalization in industry pays particular attention to
normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
NF2 non-first normal form
1NF R is in 1NF iff all domain values are atomic
2NF R is in 2 NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF R is in 3NF iff R is 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups
To create an unnormalized table
transform data from information source (eg form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which intersection of each row and column contains one and only one value
If a table of data meets the definition of a relation it is in first normal form
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table
Identify repeating group(s) in unnormalized table which repeats for the key attribute(s)
Remove repeating group by
entering appropriate data into the empty columns of rows containing repeating
data ('flattening' the table)
Or by
placing repeating data along with copy of the original key attribute(s) into a
separate relation
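The second option (moving the repeating group into a separate relation, carrying a copy of the original key) can be sketched with plain data structures; the student/course data below is invented for illustration:

```python
# Unnormalised data: one row per key value, with a repeating group of courses.
unf = [
    {"student_id": 1, "name": "Mary", "courses": ["DBMS", "OS"]},
    {"student_id": 2, "name": "John", "courses": ["DBMS"]},
]

# 1NF: place the repeating data in a separate relation, together with
# a copy of the original key attribute (student_id).
student = [{"student_id": r["student_id"], "name": r["name"]} for r in unf]
enrolment = [
    {"student_id": r["student_id"], "course": c}
    for r in unf
    for c in r["courses"]
]

print(student)
print(enrolment)
```

After the split, every attribute value in both relations is atomic, which is exactly the 1NF condition.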
Second Normal Form (2NF)
Based on concept of full functional dependency
A and B are attributes of a relation
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A
2NF - A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on the primary key
Second Normal Form (2NF)
1NF and no partial functional dependencies
Partial functional dependency when one or more non-key attributes are functionally
dependent on part of the primary key
Every non-key attribute must be defined by the entire key not just by part of the key
If a relation has a single attribute as its key then it is automatically in 2NF
1NF to 2NF
Identify primary key for the 1NF relation
Identify functional dependencies in the relation
If partial dependencies exist on the primary key remove them by placing them in a new
relation along with copy of their determinant
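These steps can be sketched on an invented enrolment relation whose composite key is (student_id, course) and where `name` depends only on student_id, a partial dependency that violates 2NF:

```python
# 1NF relation with composite key (student_id, course); "name" is
# functionally dependent on student_id alone (a partial dependency).
enrolment_1nf = [
    ("S1", "Mary", "DBMS", "A"),
    ("S1", "Mary", "OS",   "B"),
    ("S2", "John", "DBMS", "A"),
]

# Remove the partial dependency: place (student_id, name) in a new
# relation together with a copy of its determinant, student_id.
student = sorted({(sid, name) for sid, name, _, _ in enrolment_1nf})
grade   = [(sid, course, g) for sid, _, course, g in enrolment_1nf]

print(student)
print(grade)
```

In the resulting `grade` relation every non-key attribute depends on the whole key, so both relations are in 2NF.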
Third Normal Form (3NF)
2NF and no transitive dependencies
Transitive dependency a functional dependency between two or more non-key attributes
Based on concept of transitive dependency
A, B and C are attributes of a relation such that if A → B and B → C then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C)
3NF - A relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key
OR
(c) Explain multivalued dependency with suitable example
Ans
As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies
1 NF2 non-first normal form
2 1NF R is in 1NF iff all domain values are atomic
3 2NF R is in 2 NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4 3NF R is in 3NF iff R is 2NF and every nonkey attribute is non-transitively
dependent on the key
5 BCNF R is in BCNF iff every determinant is a candidate key
6 Determinant an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies
of attribute sets on something other than a superset of a candidate key A table is said to be in
4NF if and only if it is in BCNF and every multi-valued dependency is a functional
dependency The 4NF removes unwanted data structures: multi-valued dependencies
There is no multivalued dependency in the relation
There are multivalued dependencies, but the attributes are dependent between themselves
Either of these conditions must hold true for a relation to be in fourth normal form
The relation must also be in BCNF Fourth normal form differs from BCNF only in that it
uses Multivalued dependencies
(d) What are inference axioms Explain its significance in Relational
Database Design
Ans Inference Axioms (A-axioms or Armstrongrsquos Axioms)
An inference axiom is a rule that states if a relation satisfies certain FDs then it must satisfy
certain other FDs
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1 Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}
We want to show Street Zip → Street Zip City
Proof
1 Zip → City - Given
2 Street Zip → Street City - Augmentation of (1) by Street
3 City Street → Zip - Given
4 City Street → City Street Zip - Augmentation of (3) by City Street
5 Street Zip → City Street Zip - Transitivity of (2) and (4)
[From Maier]
1 Let R = (ABCDEGHI), F = {AB → E, AG → J, BE → I, E → G, GI → H}
Show that AB → GH is derived by F
1 AB → E - Given
2 AB → AB - Reflexivity
3 AB → B - Projectivity from (2)
4 AB → BE - Additivity from (1) and (3)
5 BE → I - Given
6 AB → I - Transitivity from (4) and (5)
7 E → G - Given
8 AB → G - Transitivity from (1) and (7)
9 AB → GI - Additivity from (6) and (8)
10 GI → H - Given
11 AB → H - Transitivity from (9) and (10)
12 AB → GH - Additivity from (8) and (11)
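A derivation like the one above can be checked mechanically by computing the attribute closure, which repeatedly applies the given FDs until no new attributes are added. The `closure` helper below is a sketch of that standard algorithm, not code from the source:

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (left, right) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            # If the left side is already in the closure, pull in the right side.
            if left <= result and not right <= result:
                result |= right
                changed = True
    return result

# Maier's example: F = {AB->E, AG->J, BE->I, E->G, GI->H}
fds = [(frozenset(l), frozenset(r)) for l, r in
       [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]]

# AB -> GH holds iff GH is contained in the closure of AB.
print(closure("AB", fds) >= set("GH"))
```

Running it confirms the twelve-step proof: G and H (and also E, I and J) all appear in the closure of AB.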
Significance in Relational Database design A database structure commonly used in GIS in
which data is stored based on two-dimensional tables where multiple relationships between data
elements can be defined and established in an ad-hoc manner
Relational Database Management System - a database system made up of files with data elements in two-dimensional array (rows
and columns) This database management system has the capability to recombine data elements
to form different relations resulting in a great flexibility of data usage
A database that is perceived by the user as a collection of two-dimensional tables
• Are manipulated a set at a time rather than a record at a time
• SQL is used to manipulate relational databases Proposed by Dr Codd in 1970
• The basis for the relational database management system (RDBMS)
• The relational model contains the following components:
• Collection of objects or relations
• Set of operations to act on the relations
Q5
EITHER
(a) What is deadlock How can it be avoided How can it be
resolved once it occurs Ans A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user It can be avoided in 2 ways 1 is to set measures which
prevent deadlocks from happening and 2 is to set ways in which to break the deadlock
after it happens One way to prevent or to avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing Secondly sometimes they can be avoided by setting resource access order,
meaning resources must be locked in a certain order to prevent such instances Essentially
once a deadlock does occur the DBMS must have a method for detecting the deadlock
and then to resolve it the DBMS must select a transaction to cancel and revert the entire
transaction until the resources required become available, allowing one transaction to
complete while the other has to be reprocessed at a later time
Explain the meaning of the expression ACID transaction
ACID means Atomicity, Consistency, Isolation, Durability so when any transaction happens it
should be Atomic: it should either be complete or fully incomplete There should not
be anything like semi-complete The database state should remain consistent after the
completion of the transaction If there is more than one transaction then the transactions
should be scheduled in such a fashion that they remain in isolation of one another Durability
means that once a transaction commits, its effects will persist even if there are system failures
What is the purpose of transaction isolation levels
Transaction isolation levels affect how the database is to operate while transactions are in process of being
changed Their purpose is to ensure consistency throughout the database; for example if I
am changing a row which affects the calculations or outputs of several other rows then
all rows that are affected or possibly affected by a change in the row I'm working on will
be locked from changes until I am complete with my change This isolates the change and
ensures that the data interaction remains accurate and consistent and is known as
transaction-level consistency The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read So let's
say I'm processing a change to the tax rate in my state; my store clerk shouldn't be able
to read the total cost of a blue shirt because the total cost row is affected by any changes in
the tax rate row Essentially how you deal with the reading and viewing of data while a
change is being processed but hasn't been committed is known as the transaction
isolation level Its purpose is to ensure that no one is misinformed prior to a transaction
being committed
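The resource-ordering rule for avoiding deadlock described above can be sketched with ordinary threading locks standing in for locked data items; the ordering key and the transaction bodies here are illustrative:

```python
import threading

# Two "data items"; acquiring them in one fixed global order prevents
# the circular wait that causes deadlock.
lock_a = threading.Lock()
lock_b = threading.Lock()

def transaction(locks, work):
    # Resource-ordering rule: always acquire in a fixed, globally agreed
    # order (sorting by id() is one arbitrary but consistent choice).
    ordered = sorted(locks, key=id)
    for lk in ordered:
        lk.acquire()
    try:
        work()  # run the "transaction" while holding every lock it needs
    finally:
        for lk in reversed(ordered):
            lk.release()

results = []
t1 = threading.Thread(target=transaction,
                      args=([lock_a, lock_b], lambda: results.append("t1")))
t2 = threading.Thread(target=transaction,
                      args=([lock_b, lock_a], lambda: results.append("t2")))
t1.start(); t2.start(); t1.join(); t2.join()
print(sorted(results))  # both transactions complete; no deadlock
```

Although the two threads request the locks in opposite orders, both acquire them in the same global order, so neither can end up holding one lock while waiting forever for the other.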
(b) Explain concurrency control and database recovery in detail
Ans In a multiprogramming environment where multiple transactions can be executed
simultaneously it is highly important to control the concurrency of transactions We have
concurrency control protocols to ensure atomicity isolation and serializability of concurrent
transactions Concurrency control protocols can be broadly divided into two categories:
Lock based protocols
Time stamp based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it Locks are of two
kinds:
Binary Locks: A lock on a data item can be in two states; it is either locked or
unlocked
Shared/exclusive: This type of locking mechanism differentiates the locks based on
their uses If a lock is acquired on a data item to perform a write operation it is an
exclusive lock Allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state Read locks are shared because no data
value is being changed
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed Transactions may unlock the data item after completing the
'write' operation
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking 2PL
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
Strict Two-Phase Locking
The first phase of Strict-2PL is same as 2PL After acquiring all the locks in the first phase the
transaction continues to execute normally But in contrast to 2PL Strict-2PL does not release a
lock after using it Strict-2PL holds all the locks until the commit point and releases all the locks
at a time
Strict-2PL does not have cascading abort as 2PL does
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based protocol This protocol
uses either system time or logical counter as a timestamp
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution whereas timestamp-based protocols start working as soon as a transaction is
created
Every transaction has a timestamp associated with it and the ordering is determined by the age
of the transaction A transaction created at 0002 clock time would be older than all other
transactions that come after it For example any transaction y entering the system at 0004 is
two seconds younger and the priority would be given to the older one
In addition every data item is given the latest read and write-timestamp This lets the system
know when the last 'read and write' operation was performed on the data item
OR
(b) Explain database security mechanisms8
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Loadstress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing existing system for any known or unknown vulnerabilities and defining and
implementing a road mapplan to mitigate them
(d)Explain knowledge based database system in detail
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data Data was usually represented in a tabular format with strings or numbers in each
field
Multiple users A conventional database needed to support more than one user or system
logged into the same data at the same time
Transactions An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users These are the so-
called ACID properties Atomicity Consistency Isolation and Durability
Large long-lived data A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data Such a database usually needed to persist past
the specific uses of any individual program it needed to store data for years and decades
rather than for the life of a program
The first knowledge-based systems had data needs that were the opposite of these database
requirements An expert system requires structured data Not just tables with numbers and
strings but pointers to other objects that in turn have additional pointers The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes subclasses and instances
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database The knowledge-base needed to know facts about the world For example to represent
the statement that All humans are mortal A database typically could not represent this general
knowledge but instead would need to store information about thousands of tables that
represented information about specific humans Representing that all humans are mortal and
being able to reason about any given human that they are mortal is the work of a knowledge-
base Representing that George Mary Sam Jenna Mike and hundreds of thousands of other
customers are all humans with specific ages sex address etc is the work for a database[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well On the other hand the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined As the technology scaled up it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning and knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by us humans
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans Concurrency control is the procedure in DBMS for managing simultaneous
operations without conflicting with one another Concurrent access is quite easy if all
users are just reading data There is no way they can interfere with one another Though any practical database would have a mix of READ and WRITE operations and
hence the concurrency is a challenge
Concurrency control is used to address such conflicts which mostly occur with a multi-
user system It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of respective databases
Therefore concurrency control is a most important element for the proper functioning of a system where two or multiple database transactions that require access to the same data
are executed simultaneously
(ii) Atomicity property
In database systems atomicity (from Ancient Greek ἄτομος, átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties An atomic
transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs[1] A guarantee of atomicity prevents updates to the database
occurring only partially which can cause greater problems than rejecting the whole series
outright As a consequence the transaction cannot be observed to be in progress by another
database client At one moment in time it has not yet happened and at the next it has already
occurred in whole (or nothing happened if the transaction was cancelled in progress)
An example of an atomic transaction is a monetary transfer from bank account A to account B It consists of two operations: withdrawing the money from account A and saving it to account B
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails
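The bank-transfer example can be sketched in SQLite, whose connection context manager commits the transaction on success and rolls it back on failure; the table and account names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(amount):
    try:
        with conn:  # the with-block is one atomic transaction
            conn.execute("UPDATE account SET balance = balance - ? "
                         "WHERE name = 'A'", (amount,))
            # Simulate a crash between the withdrawal and the deposit:
            raise RuntimeError("crash mid-transfer")
            conn.execute("UPDATE account SET balance = balance + ? "
                         "WHERE name = 'B'", (amount,))  # never reached here
    except RuntimeError:
        pass  # the whole transaction has been rolled back

transfer(30)
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # money neither lost nor created
```

Because the failure happens inside the atomic block, the withdrawal is undone along with everything else, and both balances are exactly as they were before the transfer started.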
(B) Give the three level architecture proposal for DBMS
Ans Objective of three level architecture proposal for DBMS
All users should be able to access same data
A users view is immune to changes made in other views
Users should not need to know physical database storage details
DBA should be able to change database storage structures without affecting the users views
Internal structure of database should be unaffected by changes to physical aspects of storage
DBA should be able to change conceptual structure of database without affecting all users
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
Above three points are explain in detail given bellow-
External Level
This is the highest level one that is closest to the user It is also called the user view The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language such as COBOL or C or a fourth-generation
language specific to the DBMS such as Visual FoxPro or MS Access
The end user uses a query language to access data from the database A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database object while the data
manipulation language performs operations on these objects The data control language is used to
control the user's access to database objects
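The DDL/DML split can be illustrated in SQLite; note that SQLite is serverless and has no user accounts, so DCL statements such as GRANT and REVOKE exist only in client-server DBMSs, and the table below is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and declare a database object.
conn.execute("CREATE TABLE dept (dept_no INTEGER PRIMARY KEY, dname TEXT)")

# DML: perform operations on that object.
conn.execute("INSERT INTO dept VALUES (10, 'Accounts')")
conn.execute("UPDATE dept SET dname = 'Accounting' WHERE dept_no = 10")
row = conn.execute("SELECT dname FROM dept WHERE dept_no = 10").fetchone()
print(row[0])

# DCL (GRANT/REVOKE) would control access rights to dept, but SQLite,
# having no notion of users, does not support it.
```

The CREATE statement changes the schema, while the INSERT/UPDATE/SELECT statements operate on the data held under that schema, which is exactly the DDL versus DML distinction.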
Conceptual Level - This level comes between the external and the internal levels The
conceptual level represents the entire database as a whole and is used by the DBA This level is
the view of the data "as it really is" The user's view of the data is constrained by the language
that they are using At the conceptual level the data is viewed without any of these constraints
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
So that objective of three level of architecture proposal for DBMS are suitable explain in
above
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2 DML Compiler and Query optimizer - The DML commands such as insert update delete
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access The object code is then optimized in the best way to execute a query by
the query optimizer and then send to the data manager
3 Data Manager - The Data Manager is the central software component of the DBMS also knows
as Database Control System
The Main Functions Of Data Manager Are ndash
Convert operations in users Queries coming from the application programs or combination of
DML Compiler and Query optimizer which is known as Query Processor from users logical view
to physical file system
Controls DBMS information access that is stored on disk
It also controls handling buffers in main memory
It also enforces constraints to maintain consistency and integrity of the data
It also synchronizes the simultaneous operations performed by the concurrent users
It also controls the backup and recovery operations
4 Data Dictionary - Data Dictionary is a repository of description of data in the database It
contains information about
1 Data - names of the tables, names of attributes of each table, length of attributes, and number of rows in each table
2 Relationships between database transactions and data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed
3 Constraints on data, ie range of values permitted
4 Detailed information on physical database design such as storage structure, access paths, files and record sizes
5 Access Authorization - the description of database users, their responsibilities and their access rights
6 Usage statistics such as frequency of query and transactions
7 Data dictionary is used to actually control the data integrity, database operation and accuracy It may be used as an important part of the DBMS
Importance of Data Dictionary -
Data Dictionary is necessary in the databases due to following reasons:
• It improves the control of DBA over the information system and users' understanding of use of the system
• It helps in documenting the database design process by storing documentation of the result of every design phase and design decisions
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts the high level queries into low level file access
commands known as compiled DML
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve User Naïve users need not be aware of the presence of the database system or any other system A user of an automatic teller machine falls under this category The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of her or his own accounts Other such naïve users are those where the type and range of response is always indicated to the user Thus a very competent database designer could be allowed to use a particular database system only as a naïve user
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view, or conceptual level, of the database. The DBA further specifies the external views of the various users and applications and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It implies separation of physical storage from the use of the data by an application program, i.e. program/data independence. The user, programmer or application specialist need not know the details of how the data are stored; such details are transparent to the user. Changes can be made to data without affecting other components of the system, e.g. changing the format of data items (real to integer arithmetic), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
Duplication of the same data in different files.
Wastage of storage space, since duplicated data is stored.
Errors generated due to updating of the same data in different files.
Time wasted in entering the same data again and again.
Needless use of computer resources.
Difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data. We therefore need to remove this duplication of data across multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, the use of a DBMS should allow users who don't know programming to interact with the data more easily, unlike a file processing system, where the programmer may need to write new programs to meet every new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5. Integrity can be improved - Since data in the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an ad hoc, temporary manner. Often different systems of an organization would access different components of the operational data; in such an environment, enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database: different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8. Organization's requirements can be identified - All organizations have sections and departments, and each of these units often considers its own work the most important, and therefore its own needs the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. It may thus become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for similar services using conventional systems, since the productivity of programmers can be higher when using the non-procedural languages developed for DBMSs than when using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed as the needs of particular applications demand; the overall view is often not considered. Building an overall view of an organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes such as recovery and backup from failures, including disk crashes, power failures and software errors, which may help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The E-R model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Modelling is an iterative, team-oriented process with all business managers (or their designates) involved, and should be validated with a "bottom-up" approach. The model has three primary components: entities, relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified An entity is an abstraction from the complexities of some
domain When we speak of an entity we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world An entity may be a physical object such as a house or a car an event such as a house sale or a car service or a concept such as a customer transaction or order
An entity-type is a category An entity strictly speaking is an instance of a given entity-type There are
usually many instances of an entity-type Because the term entity-type is somewhat cumbersome most
people tend to use the term entity as a synonym for this term
Attributes: An attribute is a characteristic of an entity. A Student's (entity) attributes: student ID, student name, address, etc.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer; a supervises relationship between an employee and a department; a performs relationship between an artist and a song; a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds, connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many:  1 -------< M
Many to one:  M >------- 1
Many to many: M >------< M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
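The Customer entity above can be mapped to a relational table by flattening its composite attributes (name, address, street) into individual columns. A minimal sketch using Python's sqlite3 module; the column layout and sample data are illustrative assumptions, not part of the original answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Composite attributes are flattened into separate columns;
# customer_id remains the primary key.
conn.execute("""
    CREATE TABLE Customer (
        customer_id   INTEGER PRIMARY KEY,
        first_name    TEXT,
        middle_name   TEXT,
        last_name     TEXT,
        phone_number  TEXT,
        date_of_birth TEXT,
        city          TEXT,
        state         TEXT,
        zip_code      TEXT,
        street_name   TEXT,
        street_number TEXT
    )
""")

# Hypothetical sample row
conn.execute(
    "INSERT INTO Customer VALUES (1, 'Asha', NULL, 'Rao', '555-0101', "
    "'1990-04-12', 'Nagpur', 'MH', '440001', 'Main St', '12')"
)

row = conn.execute(
    "SELECT first_name, city FROM Customer WHERE customer_id = 1"
).fetchone()
print(row)  # ('Asha', 'Nagpur')
```

Multivalued attributes (e.g. several phone numbers) would instead go into a separate table keyed on customer_id, and derived attributes (e.g. age from date_of_birth) are computed rather than stored.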
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index sequential files and direct files, we have considered the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file on the attribute "stud_name", we may get a set of records which satisfy the given value.
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation

QUE3- EITHER
(A) Let R(A,B,C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot be non-loss decomposed any further into smaller tables. Another way of expressing this is that every join dependency is a consequence of the candidate keys. It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependency: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one, you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
(B) Explain the architecture of an IMS System
Ans: Information Management System (IMS) is an IBM program product that is designed to support both batch and online application programs.
[Architecture diagram: Applications A and B, each written in a host language with DL/I calls, access the database through their program specification blocks (PSB-A, PSB-B), each consisting of PCBs; the IMS control program maps these via DBDs onto the stored physical databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD), which also defines the mapping of the physical database to storage. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE#,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called a program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are supported via user-written online application programs. IMS does not provide an integrated query language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
Main characteristics of functional dependencies used in normalization: they have a 1:1 relationship between the attribute(s) on the left-hand and right-hand sides of the dependency, they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and has the property that every functional dependency in Y is implied by the functional dependencies in X.
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal form condition that the relation meets, and indicates the degree to which it has been normalized. Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF removes unwanted data structures: multivalued dependencies.
One of these conditions must hold for a relation to be in fourth normal form:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers multivalued dependencies.
Q5
Either
(A) What are object oriented database systems What are its features
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found by a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is the way that the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could retrieve a user's account information and efficiently provide extensive details such as transactions and account entries.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000. It has a built-in feature known as the database recovery model that controls the following:
Both the speed and size of your transaction log backups.
The degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery model available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. The SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified point in time can be achieved after media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index creations is faster, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT and UPDATETEXT.
Simple Recovery Model
It allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans
(d) State advantages and disadvantages of following file organizations
(i) Index-Sequential file
Ans
Sequential File Organization
1 A sequential file is designed for efficient processing of records in sorted order on some
search key
o Records are chained together by pointers to permit fast retrieval in search key
order
o Pointer points to next record in order
o Records are stored physically in search key order (or as close to this as possible)
o This minimizes number of block accesses
o Figure 10.15 shows an example with bname as the search key
2 It is difficult to maintain physical sequential order as records are inserted and deleted
o Deletion can be managed with the pointer chains
o Insertion poses problems if no space where new record should go
o If space use it else put new record in an overflow block
o Adjust pointers accordingly
o Figure 10.16 shows the previous example after an insertion
o Problem we now have some records out of physical sequential order
o If very few records in overflow blocks this will work well
o If order is lost reorganize the file
o Reorganizations are expensive and done when system load is low
3 If insertions rarely occur we could keep the file in physically sorted order and reorganize
when insertion occurs In this case the pointer fields are no longer required
The Sequential File
A fixed format is used for records: records are the same length, with all fields the same (order and length). Field names and lengths are attributes of the file.
One field is the key field; it uniquely identifies the record, and records are stored in key sequence.
New records are placed in a log file or transaction file, and a batch update is performed to merge the log file with the master file.
(ii) Direct file
Direct Access File System (DAFS) is a network file system similar to Network File System
(NFS) and Common Internet File System (CIFS) that allows applications to transfer data while
bypassing operating system control buffering and network protocol operations that can
bottleneck throughput DAFS uses the Virtual Interface (VI) architecture as its underlying
transport mechanism Using VI hardware an application transfers data to and from application
buffers without using the operating system which frees up the processor and operating system
for other processes and allows files to be accessed by servers using several different operating
systems DAFS is designed and optimized for clustered shared-file network environments that
are commonly used for Internet e-commerce and database applications DAFS is optimized for
high-bandwidth InfiniBand networks and it works with any interconnection that supports VI
including Fibre Channel and Ethernet
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and promote DAFS. Today, more than 85 companies are part of the DAFS Collaborative.
Q3
EITHER
(a) Explain tuple relational calculus
Ans
Relational Calculus
A relational calculus query specifies what is to be retrieved rather than how to retrieve it; there is no description of how to evaluate the query.
In first-order logic (or predicate calculus), a predicate is a truth-valued function with arguments. When we substitute values for the arguments, the function yields an expression, called a proposition, which can be either true or false.
If a predicate contains a variable (e.g. 'x is a member of staff'), there must be a range for x. When we substitute some values of this range for x, the proposition may be true; for other values it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
Tuple Relational Calculus
We are interested in finding tuples for which a predicate is true, based on the use of tuple variables. A tuple variable is a variable that 'ranges over' a named relation, i.e. a variable whose only permitted values are tuples of the relation.
We specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Tuple Relational Calculus - Example
To find details of all staff earning more than $10,000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
We can use two quantifiers to tell how many instances the predicate applies to:
the existential quantifier ∃ ('there exists') and the universal quantifier ∀ ('for all').
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called free variables.
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧ (B.branchNo = S.branchNo) ∧ B.city = 'London')
This means: 'There exists a Branch tuple with the same branchNo as the branchNo of the current Staff tuple S, and which is located in London.'
The universal quantifier is used in statements about every instance, such as:
(∀B)(B.city ≠ 'Paris')
This means: 'For all Branch tuples, the address is not in Paris.'
We can also use ~(∃B)(B.city = 'Paris'), which means: 'There are no branches with an address in Paris.'
Formulae should be unambiguous and make sense. A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
We can recursively build up formulae from atoms:
An atom is a formula.
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction F1 ∨ F2, and negation ~F1.
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also formulae.
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000:
{S.fName, S.lName | Staff(S) ∧ S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow:
{S | Staff(S) ∧ (∃P)(PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
To avoid this, we add the restriction that all values in the result must be values in the domain of the expression.
Data Manipulation in SQL
SELECT, UPDATE, DELETE, INSERT statements
Basic data retrieval
Condition specification
Arithmetic and aggregate operators
SQL joins: multiple-table queries
Set manipulation: ANY, IN, CONTAINS, ALL, NOT IN, NOT CONTAINS, EXISTS, UNION, MINUS, INTERSECT
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement; data must be entered later using INSERT.
CREATE TABLE S ( SNO    CHAR(5),
                 SNAME  CHAR(20),
                 STATUS DECIMAL(3),
                 CITY   CHAR(15),
                 PRIMARY KEY (SNO) )
Creating Tables
A table name and unique column names must be specified.
Columns which are defined as primary keys will never have two rows with the same key
value.
A primary key may consist of more than one column (values unique in combination);
this is called a composite key.
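A composite key can be demonstrated concretely. The sketch below (via Python's sqlite3; the supplier-part table SP and its values are illustrative assumptions) shows that repeating one key column is fine, but repeating the combination is rejected:

```python
import sqlite3

# Composite (multi-column) primary key: values must be unique in combination.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SP (
        SNO CHAR(5),
        PNO CHAR(5),
        QTY DECIMAL(5),
        PRIMARY KEY (SNO, PNO)
    )
""")
conn.execute("INSERT INTO SP VALUES ('S1', 'P1', 300)")
conn.execute("INSERT INTO SP VALUES ('S1', 'P2', 200)")  # same SNO, new PNO: fine
try:
    conn.execute("INSERT INTO SP VALUES ('S1', 'P1', 400)")  # repeats the (SNO, PNO) pair
    duplicate_pair_blocked = False
except sqlite3.IntegrityError:
    duplicate_pair_blocked = True
row_count = conn.execute("SELECT COUNT(*) FROM SP").fetchone()[0]
```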
(b) Explain Data Manipulation in SQL.
Ans:
A data manipulation language (DML) is a computer programming language used for adding
(inserting), deleting and modifying (updating) data in a database. A DML is often
a sublanguage of a broader database language such as SQL, with the DML comprising some of
the operators in the language.[1] Read-only selecting of data is sometimes distinguished as being
part of a separate data query language (DQL), but it is closely related and sometimes also
considered a component of a DML; some operators may perform both selecting (reading) and
writing.
A popular data manipulation language is that of Structured Query Language (SQL), which is
used to retrieve and manipulate data in a relational database.[2] Other forms of DML are those
used by IMS/DL/I, CODASYL databases such as IDMS, and others.
In SQL, the data manipulation language comprises the SQL-data change statements,[3] which
modify stored data but not the schema or database objects. Manipulation of persistent database
objects, e.g. tables or stored procedures, via the SQL schema statements,[3] rather than the data
stored within them, is considered to be part of a separate data definition language (DDL). In SQL
these two categories are similar in their detailed syntax, data types, expressions, etc., but distinct
in their overall function.[3]
The SQL-data change statements are a subset of the SQL-data statements; this set also contains
the SELECT query statement,[3] which, strictly speaking, is part of the DQL, not the DML. In
common practice, though, this distinction is not made, and SELECT is widely considered to be
part of DML,[4] so the DML consists of all SQL-data statements, not only the SQL-data
change statements. The SELECT ... INTO form combines both selection and manipulation,
and thus is strictly considered to be DML because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a
statement, which is almost always a verb. In the case of SQL, these verbs are:
SELECT ... FROM ... WHERE ... (strictly speaking DQL)
SELECT ... INTO ...
INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...
For example, the command to insert a row into table employees:
INSERT INTO employees (first_name, last_name, fname) VALUES ('John', 'Capita',
'xcapit00');
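The four DML verbs can be exercised end to end. A small sketch via Python's sqlite3, reusing the employees example above (the updated surname is an illustrative value):

```python
import sqlite3

# The DML verbs from the list above, run against the employees example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT INTO ... VALUES
conn.execute("INSERT INTO employees (first_name, last_name, fname)"
             " VALUES ('John', 'Capita', 'xcapit00')")

# UPDATE ... SET ... WHERE
conn.execute("UPDATE employees SET last_name = 'Capita-Smith'"
             " WHERE fname = 'xcapit00'")

# SELECT ... FROM ... WHERE (strictly DQL)
row = conn.execute("SELECT last_name FROM employees"
                   " WHERE fname = 'xcapit00'").fetchone()

# DELETE FROM ... WHERE
conn.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
remaining = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
```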
OR
(c) Explain the following integrity rules:
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce
these rules automatically, but it is safer to make sure that the rules are
already applied in the design. There are two types of integrity mentioned in
integrity rules: entity and referential. Two additional rules that aren't
necessarily included in integrity rules but are pertinent to database designs
are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that
is unique; this ensures that each row is uniquely identified by the primary
key. One requirement for entity integrity is that a primary key cannot have a
null value. The purpose of this integrity is to give each row a unique
identity, so that foreign key values can properly reference primary key values.
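Both halves of entity integrity, uniqueness and no nulls, can be checked in a few lines. A sketch via Python's sqlite3 (the Dept table is an illustrative assumption; note SQLite has a legacy quirk allowing NULL in non-INTEGER primary keys, so NOT NULL is spelled out to get the textbook behaviour):

```python
import sqlite3

# Entity integrity: the primary key must be unique and non-null.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Dept (deptNo TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO Dept VALUES ('D1', 'Sales')")
try:
    conn.execute("INSERT INTO Dept VALUES ('D1', 'Marketing')")  # duplicate key
    duplicate_blocked = False
except sqlite3.IntegrityError:
    duplicate_blocked = True
try:
    conn.execute("INSERT INTO Dept VALUES (NULL, 'HR')")  # null key
    null_blocked = False
except sqlite3.IntegrityError:
    null_blocked = True
```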
Theta Join
In a theta join we apply a condition on the input relation(s), and then only the selected
rows are used in the cross product to be merged and included in the output. In a normal
cross product, all the rows of one relation are mapped/merged with all the rows of the
second relation, but here only selected rows of a relation enter the cross product with the
second relation. It is denoted R ⋈θ S.
If R and S are two relations, then θ is the condition which is applied for the select
operation on one relation, and then only the selected rows are cross-producted with all the
rows of the second relation. For example, take two relations FACULTY and
COURSE: we first apply the select operation on the FACULTY relation to
select certain specific rows, then these rows have a cross product with the
COURSE relation. This is the difference between cross product and theta join.
Looking at both relations, their different attributes, and finally the cross product after
carrying out the select operation, the difference between cross product and theta join
becomes clear.
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having
the foreign key (in the associated table) reference a primary key value (in the primary, or
parent, table). Because of this, we need to ensure that data on both sides of the relationship
remain intact.
So referential integrity requires that whenever a foreign key value is used, it must reference a
valid, existing primary key in the parent table.
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no
foreign key in any related table with the value of 15. We should only be able to delete a primary
key if there are no associated records. Otherwise we would end up with an orphaned record:
the related table would contain a foreign key value that doesn't exist in the primary key field of
the primary table (i.e. the "CompanyId" field), resulting in an "orphaned record".
So referential integrity will prevent users from:
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually
with no indication of an error. This could result in records being "lost" in the database because
they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated
company).
Or, worse yet, it could result in customers not receiving products they paid for.
Worse still, it could affect life and death situations, such as a hospital patient not receiving the
correct treatment, or a disaster relief team not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity, which is concerned with the accuracy and
consistency of all data (relationship or otherwise). Maintaining data integrity is a crucial part of
working with databases.
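The CompanyId scenario above can be replayed directly. A sketch via Python's sqlite3 (the Company/Product tables are illustrative; SQLite needs the foreign_keys pragma switched on to enforce the constraint):

```python
import sqlite3

# Referential integrity: the orphan insert and the parent delete both fail.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE Company (CompanyId INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("""
    CREATE TABLE Product (
        ProductId INTEGER PRIMARY KEY,
        CompanyId INTEGER REFERENCES Company(CompanyId)
    )
""")
conn.execute("INSERT INTO Company VALUES (15, 'Acme')")
conn.execute("INSERT INTO Product VALUES (1, 15)")       # valid parent: accepted
try:
    conn.execute("INSERT INTO Product VALUES (2, 99)")   # no Company 99: orphan
    orphan_blocked = False
except sqlite3.IntegrityError:
    orphan_blocked = True
try:
    # deleting record 15 would orphan Product 1, as in the example above
    conn.execute("DELETE FROM Company WHERE CompanyId = 15")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True
```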
(d) Explain the following domain in detail with example.
Ans: Definition: The domain of a database attribute is the set of all allowable values that
attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column.
In data management and database analysis, a data domain refers to all the unique values which
a data element may contain. The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values.[1]
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type and
allowed to have one of two known code values: M for male, F for female, and NULL for
records where gender is unknown or not applicable (or, arguably, U for unknown as a sentinel
value). The data domain for the gender column is {M, F}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value, excluding NULL. Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check
constraint or, in more complex cases, in a database trigger. For example, a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero.
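Both domain rules described above, the enumerated gender codes and the positive-value check constraint, can be sketched via Python's sqlite3 (the Person table and its columns are illustrative assumptions):

```python
import sqlite3

# Domain enforcement with CHECK constraints: gender limited to 'M'/'F',
# and a values-greater-than-zero rule on qty.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Person (
        name   TEXT,
        gender TEXT CHECK (gender IN ('M', 'F')),
        qty    INTEGER CHECK (qty > 0)
    )
""")
conn.execute("INSERT INTO Person VALUES ('Ann', 'F', 3)")
try:
    conn.execute("INSERT INTO Person VALUES ('Bob', 'X', 1)")  # outside the domain
    bad_gender_blocked = False
except sqlite3.IntegrityError:
    bad_gender_blocked = True
try:
    conn.execute("INSERT INTO Person VALUES ('Cid', 'M', -5)")  # violates qty > 0
    bad_qty_blocked = False
except sqlite3.IntegrityError:
    bad_qty_blocked = True
```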
This definition combines the concept of a domain as an area over which control is exercised
with the mathematical idea of a set of values of an independent variable for which a function is
defined.
(ii) Degree and cardinality
The degree of a relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) with the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
The latter is written M:N, not M:M, because the two sides need not be equal.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A
one-to-one relationship rarely exists in practice, but it can; you may consider combining
the two entities into one.
For example, an employee is allocated a company car which can only be driven by that
employee.
Therefore, there is a one-to-one relationship between employee and company car.
One-to-many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For
example, taking the employee and department entities, an employee works in
one department but a department has many employees.
Therefore, there is a one-to-many relationship between department and employee.
Many-to-many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would prevent any such relationships, but the
definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely exist. Normally they occur
because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a
team of many employees.
Therefore, there is a many-to-many relationship between employee and project.
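In a relational schema, an M:N relationship like employee/project is resolved with a link (junction) table whose composite key pairs the two entities. A sketch via Python's sqlite3; the table and column names are illustrative:

```python
import sqlite3

# Many-to-many employee/project relationship via a junction table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (empNo INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Project  (projNo INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE WorksOn (
        empNo  INTEGER REFERENCES Employee(empNo),
        projNo INTEGER REFERENCES Project(projNo),
        PRIMARY KEY (empNo, projNo)
    );
    INSERT INTO Employee VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Project  VALUES (10, 'Payroll'), (20, 'Website');
    INSERT INTO WorksOn  VALUES (1, 10), (1, 20), (2, 10);
""")
# Ann works on several projects; project 10 has a team of several employees
ann_projects = conn.execute(
    "SELECT COUNT(*) FROM WorksOn WHERE empNo = 1").fetchone()[0]
team_size = conn.execute(
    "SELECT COUNT(*) FROM WorksOn WHERE projNo = 10").fetchone()[0]
```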
Q4
EITHER
(a) Explain DBTG Data Manipulation.
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL), the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct and self-contained language. The DBTG is intended to meet the
requirements of many distinct programming languages, not just COBOL; the user in a
DBTG system is considered to be an ordinary application programmer, and the language
therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the
Data Definition Language or DDL), the DBTG proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of the conceptual scheme that
was itself defined using the Data Definition Language. It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view.
Architecture of the DBTG Model
The architecture of a DBTG system is illustrated in Figure.
The architecture of the DBTG model can be divided into three different levels, as in the
architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data-items they
contain, and the sets into which they are grouped. (Here, logical record types are referred
to as record types; the fields in a logical record format are called data-items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data-items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data-item and set are excluded.
In the DBTG model, the users are application programmers writing in an ordinary
programming language, such as COBOL, that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data-
item) defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization: the process of decomposing unsatisfactory, "bad" relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties.
Normalization in industry pays particular attention to
normalization up to 3NF, BCNF or 4NF.
We will pay particular attention up to 3NF.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table:
transform data from the information source (e.g. a form) into table format with columns
and rows.
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
If a table of data meets the definition of a relation, it is in first normal form:
Every relation has a unique name.
Every attribute value is atomic (single-valued).
Every row is unique.
Attributes in tables have unique names.
The order of the columns is irrelevant.
The order of the rows is irrelevant.
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove the repeating group by:
entering appropriate data into the empty columns of rows containing repeating
data ('flattening' the table),
or by
placing the repeating data, along with a copy of the original key attribute(s), into a
separate relation.
Second Normal Form (2NF)
Based on the concept of full functional dependency:
A and B are attributes of a relation.
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A.
2NF - A relation that is in 1NF and in which every non-primary-key attribute is fully
functionally dependent on the primary key.
Second Normal Form (2NF)
1NF and no partial functional dependencies.
Partial functional dependency: when one or more non-key attributes are functionally
dependent on part of the primary key.
Every non-key attribute must be defined by the entire key, not just by part of the key.
If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies exist on the primary key, remove them by placing them in a new
relation along with a copy of their determinant.
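The 1NF-to-2NF step can be illustrated with a concrete decomposition. A sketch via Python's sqlite3, under assumed names: in OrderLine(orderNo, productNo, productName, qty), productName depends only on productNo (part of the composite key), so it moves to its own relation with a copy of its determinant:

```python
import sqlite3

# After removing the partial dependency productNo -> productName,
# the product name is stored once, however many order lines mention it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (
        productNo   INTEGER PRIMARY KEY,   -- the determinant, copied out
        productName TEXT
    );
    CREATE TABLE OrderLine (
        orderNo   INTEGER,
        productNo INTEGER REFERENCES Product(productNo),
        qty       INTEGER,
        PRIMARY KEY (orderNo, productNo)   -- composite key remains
    );
    INSERT INTO Product   VALUES (7, 'Widget');
    INSERT INTO OrderLine VALUES (100, 7, 3), (101, 7, 5);
""")
name = conn.execute(
    "SELECT productName FROM Product WHERE productNo = 7").fetchone()[0]
lines = conn.execute("SELECT COUNT(*) FROM OrderLine").fetchone()[0]
```

With the dependency removed, renaming the product is a single-row update instead of one per order line, which is exactly the update anomaly 2NF eliminates.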
Third Normal Form (3NF)
2NF and no transitive dependencies.
Transitive dependency: a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency:
A, B and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C).
3NF - A relation that is in 1NF and 2NF, and in which no non-primary-key
attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with suitable example.
Ans:
As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is said to be in
4NF if and only if it is in BCNF and its multi-valued dependencies are functional
dependencies. 4NF removes unwanted data structures: multi-valued dependencies.
Either of these conditions must hold true for a relation to be in fourth normal form:
there is no multivalued dependency in the relation, or
there are multivalued dependencies but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
uses multivalued dependencies.
(d) What are inference axioms? Explain their significance in Relational
Database Design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy
certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of inference axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity of (2) and (4)
[From Maier]
1. Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
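Derivations like the one above can be checked mechanically by computing the attribute closure: repeatedly applying the given FDs is itself an application of the inference axioms. A minimal sketch in Python, using Maier's FD set:

```python
# Attribute closure: the set of all attributes functionally determined by attrs.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is covered, transitivity lets us add the right side
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Maier's example: F = {AB -> E, AG -> J, BE -> I, E -> G, GI -> H}
F = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]
c = closure("AB", F)
# G and H are in AB+, so AB -> GH holds, confirming the 12-step derivation
```

Closures are the standard way inference axioms are put to work in design: they decide FD membership, find candidate keys, and test whether a decomposition is lossless.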
Significance in Relational Database Design: A relational database is a database structure,
commonly used in GIS, in which data is stored in two-dimensional tables where multiple
relationships between data elements can be defined and established in an ad hoc manner. A
Relational Database Management System is a database system made up of files with data
elements in a two-dimensional array (rows and columns); it has the capability to recombine
data elements to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables:
• Tables are manipulated a set at a time, rather than a record at a time.
• SQL is used to manipulate relational databases. Proposed by Dr Codd in 1970.
• It is the basis for the relational database management system (RDBMS).
• The relational model contains the following components:
• A collection of objects or relations
• A set of operations to act on the relations
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be avoided in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens. One way to prevent or to avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order to prevent such instances. Essentially,
once a deadlock does occur, the DBMS must have a method for detecting the deadlock;
then, to resolve it, the DBMS must select a transaction to cancel and revert the entire
transaction until the resources required become available, allowing one transaction to
complete while the other has to be reprocessed at a later time.
Explain the meaning of the expression "ACID transaction".
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it
should be atomic: it should either be complete or fully incomplete; there should not
be anything like semi-complete. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, then the transactions
should be scheduled in such a fashion that they remain in isolation from one another.
Durability means that once a transaction commits, its effects will persist even if there are
system failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database is to operate while transactions are in the
process of being changed. Their purpose is to ensure consistency throughout the database. For
example, if I am changing a row which affects the calculations or outputs of several other rows,
then all rows that are affected, or possibly affected, by a change in the row I'm working on will
be locked from changes until I am complete with my change. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let's
say I'm processing a change to the tax rate in my state: my store clerk shouldn't be able
to read the total cost of a blue shirt, because the total cost row is affected by any changes in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but hasn't been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
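The resource-access-order strategy for avoiding deadlock, described in the answer above, can be sketched with two ordinary thread locks standing in for database locks (the transaction names are illustrative):

```python
import threading

# Deadlock avoidance by resource-access order: every "transaction" acquires
# the two locks in the same fixed global order (lock_a before lock_b), so no
# circular wait can form.
lock_a, lock_b = threading.Lock(), threading.Lock()

def transaction(name, results):
    with lock_a:          # always first
        with lock_b:      # always second
            results.append(name)

results = []
threads = [threading.Thread(target=transaction, args=(n, results))
           for n in ("T1", "T2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Both transactions complete; if one thread took lock_a first and the other
# took lock_b first, they could each wait on the other forever.
```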
(b) Explain concurrency control and database recovery in detail.
Ans: In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
Binary locks: a lock on a data item can be in two states; it is either locked or
unlocked.
Shared/exclusive: this type of locking mechanism differentiates the locks based on
their uses. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock. Allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state. Read locks are shared because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks. Before initiating an execution, the transaction requests the system for all the locks it
needs beforehand. If all the locks are granted, the transaction executes and releases all the locks
when all its operations are over. If all the locks are not granted, the transaction rolls back and
waits until all the locks are granted.
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first
part, when the transaction starts executing, it seeks permission for the locks it requires. The
second part is where the transaction acquires all the locks. As soon as the transaction releases its
first lock, the third phase starts. In this phase, the transaction cannot demand any new locks; it
only releases the acquired locks.
Two-phase locking has two phases: one is growing, where all the locks are being acquired by
the transaction, and the second phase is shrinking, where the locks held by the transaction are
being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as 2PL. After acquiring all the locks in the first phase, the
transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a
lock after using it. Strict-2PL holds all the locks until the commit point and releases all the locks
at one time.
Strict-2PL does not have cascading aborts as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at 0002 clock time would be older than all other
transactions that come after it. For example, any transaction y entering the system at 0004 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read and write timestamps. This lets the system
know when the last 'read and write' operation was performed on the data item.
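The per-item read/write timestamps just described drive basic timestamp ordering. A minimal sketch (not a full protocol; abort is modelled as simply returning False) of the rule that an operation is rejected when a younger transaction has already touched the item:

```python
# Basic timestamp ordering, sketched: each data item carries the latest
# read-timestamp and write-timestamp, as described above.
class Item:
    def __init__(self):
        self.read_ts = 0    # latest read timestamp
        self.write_ts = 0   # latest write timestamp

def read(item, ts):
    if ts < item.write_ts:              # a younger transaction already wrote:
        return False                    # the reader must abort
    item.read_ts = max(item.read_ts, ts)
    return True

def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:
        return False                    # would invalidate a younger read/write
    item.write_ts = ts
    return True

x = Item()
write(x, 5)           # transaction with timestamp 5 writes x
ok_old = read(x, 3)   # older transaction (ts 3) must abort: x written at ts 5
ok_new = read(x, 7)   # younger transaction (ts 7) reads successfully
```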
OR
(b) Explain database security mechanisms. (8)
Ans: Database security covers and enforces security on all aspects and components of databases. This
includes:
Data stored in the database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator
and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d) Explain knowledge-based database systems in detail.
Ans:
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database. At the time (the 1970s), virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database. At this point in the history of Information Technology, the distinction
between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
Flat data: data was usually represented in a tabular format with strings or numbers in each
field.
Multiple users: a conventional database needed to support more than one user or system
logged into the same data at the same time.
Transactions: an essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-
called ACID properties: Atomicity, Consistency, Isolation and Durability.
Large, long-lived data: a corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze and reuse
knowledge. For example, see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database. The knowledge-base needed to know facts about the world: for example, to represent
the statement that "all humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store information about thousands of tables that
represented information about specific humans. Representing that all humans are mortal, and
being able to reason about any given human that they are mortal, is the work of a knowledge-
base. Representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other
customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments,
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple, distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and Object-Oriented
communities, object-oriented databases such as Versant emerged. These were systems designed
from the ground up to have support for object-oriented capabilities, but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements, such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet. With the rise of the Internet,
documents, hypertext and multimedia support were now critical for any corporate database. It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents. This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge
management vendors, such as Lotus Notes. Knowledge Management actually predated the
Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term knowledge-base to describe their repositories, but the
meaning had a subtle difference. In the case of previous knowledge-based systems, the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world. With knowledge management products, the knowledge was
primarily meant for humans, for example to serve as a repository of manuals, procedures,
policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, versus knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018 – 2019
Subject: DBMS
MCA 1st Year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data, since there is no way they can interfere with one another. However,
any practical database has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user
system. It helps ensure that database transactions are performed concurrently
without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously.
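One common technique is optimistic concurrency control: each row carries a version counter, and an update succeeds only if no other writer has changed the row since it was read. The sketch below is illustrative only (the `accounts` table and its columns are assumptions, not part of the question), using Python's sqlite3:

```python
import sqlite3

# Set up a tiny illustrative table with a version column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100, 0)")
conn.commit()

def update_balance(conn, acct_id, new_balance, expected_version):
    """Succeed only if no other transaction has bumped the version since we read it."""
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, acct_id, expected_version))
    conn.commit()
    return cur.rowcount == 1   # 0 rows touched => a concurrent writer got there first

# Two clients both read version 0; only the first write is accepted.
first  = update_balance(conn, 1, 150, expected_version=0)
second = update_balance(conn, 1, 90,  expected_version=0)
print(first, second)  # True False
```

The second update silently matching zero rows, rather than overwriting the first, is exactly the "lost update" conflict that concurrency control exists to prevent.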
(ii) Atomicity property
Ans: In database systems, atomicity (from the Greek átomos, "undividable") is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur, or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and depositing it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails.
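The A-to-B transfer above can be sketched with sqlite3 transactions (the table and amounts are illustrative assumptions): both updates commit together, or an error rolls both back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Withdraw and deposit as one atomic unit: both happen or neither does."""
    try:
        with conn:  # transaction: commits on success, rolls back on exception
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            (bal,) = conn.execute("SELECT balance FROM account WHERE name = ?",
                                  (src,)).fetchone()
            if bal < 0:
                raise ValueError("insufficient funds")  # aborts the whole transfer
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except ValueError:
        return False

transfer(conn, "A", "B", 30)    # succeeds: A=70, B=80
transfer(conn, "A", "B", 500)   # fails: the withdrawal is rolled back too
```

After the failed transfer the balances are unchanged; no money was lost or created.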
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database; this view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part that
is relevant to the user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
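The three sublanguages can be seen side by side in a short sketch (Python's sqlite3 here; note that SQLite has no user accounts, so the DCL statement is shown only as the text a server DBMS such as PostgreSQL would accept, not executed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and declare the database object.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on the object.
conn.execute("INSERT INTO student (id, name) VALUES (1, 'Asha')")
rows = conn.execute("SELECT name FROM student").fetchall()

# DCL: control access to the object.  SQLite has no users, so this is
# only the statement a server DBMS would accept:
dcl = "GRANT SELECT ON student TO report_user"

print(rows)  # [('Asha',)]
```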
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for DBMS are explained above.
(C) Describe the structure of DBMS.
Ans: A DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update and
retrieval) on the database. The components of the DBMS perform these requested operations on the
database and provide the necessary data to the users.
Fig: Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified
in the DDL. It records metadata information such as the names of the files, data items, storage
details of each file, mapping information, constraints, etc.
2. DML Compiler and Query Optimizer - DML commands such as insert, update, delete and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized by the query optimizer to find the best
way to execute the query, and then sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
It converts operations in users' queries, coming from the application programs or from the
combination of DML compiler and query optimizer (known as the Query Processor), from the
user's logical view to the physical file system.
It controls access to the DBMS information that is stored on disk.
It also handles buffers in main memory.
It also enforces constraints to maintain the consistency and integrity of the data.
It also synchronizes the simultaneous operations performed by concurrent users.
It also controls the backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
5. Access authorization - the description of database users, their responsibilities
and their access rights.
6. Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation
and accuracy. It may be used as an important part of the DBMS.
Importance of the Data Dictionary -
The Data Dictionary is necessary in databases due to the following reasons:
It improves the control of the DBA over the information system and the users'
understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve users: Naive users need not be aware of the presence of the database system or any other supporting system. A user of an automatic teller machine falls into this category: the user is instructed through each step of a transaction, and he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the ATM user, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing the application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view, or conceptual level, of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and the access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of physical storage from the use of the
data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored, as such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic operations), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one device to
another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updating of the same data in different files
• Time wasted in entering the same data again and again
• Computer resources being needlessly used
• Difficulty in combining information
2. Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data. We therefore need to remove this duplication of
data in multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In
a conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness are likely to improve, since the data can now be shared, and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, use of a DBMS
should allow users who don't know programming to interact with the data more easily, unlike a
file processing system, where the programmer may need to write new programs to meet every
new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the data in the
database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization would access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database, and different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8. Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work, and therefore its own
needs, as the most important. Once a database has been
set up with centralized control, it becomes necessary to identify the organization's requirements and
to balance the needs of the competing units. It may become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining the
application programs to be far lower than for a similar service using conventional systems,
since the productivity of programmers can be higher when using the non-procedural languages that
have been developed with DBMSs than when using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand; the overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes for
recovery and backup from failures, including disk crashes, power failures and software errors,
which may help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex.
QUE 2 - EITHER
(A) Explain the ER model with a suitable example.
Ans: The ER model is a "top-down" approach. This data model allows us to describe how data
is used in a real-world enterprise. Modelling is an iterative, team-oriented process in which all
business managers (or their designates) are involved, and the model should be validated with a
"bottom-up" approach. The model has three primary components: entities,
relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type. There are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name,
address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example:
Given: Entity Customer, with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, and address (city, state, zip_code, street), where
street is itself composite (street_name, street_number, apartment_number).
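One way this entity could be mapped to tables (a sketch, not the only valid mapping): composite attributes such as name, address and street are flattened into columns, while the multivalued attribute phone_number becomes a separate table keyed by customer_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Composite attributes (name, address, street) flattened into columns.
CREATE TABLE customer (
    customer_id      INTEGER PRIMARY KEY,
    first_name       TEXT,
    middle_name      TEXT,
    last_name        TEXT,
    date_of_birth    TEXT,
    city             TEXT,
    state            TEXT,
    zip_code         TEXT,
    street_name      TEXT,
    street_number    TEXT,
    apartment_number TEXT
);
-- The multivalued attribute (phone_number) becomes its own table.
CREATE TABLE customer_phone (
    customer_id  INTEGER REFERENCES customer(customer_id),
    phone_number TEXT,
    PRIMARY KEY (customer_id, phone_number)
);
""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name) VALUES (1, 'John', 'Doe')")
conn.executemany("INSERT INTO customer_phone VALUES (1, ?)", [("555-0100",), ("555-0199",)])
phones = [p for (p,) in conn.execute(
    "SELECT phone_number FROM customer_phone WHERE customer_id = 1 ORDER BY phone_number")]
print(phones)
```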
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file and direct file organizations, we have considered the retrieval and
update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
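The point can be sketched in a few lines (sample data invented for illustration): unlike the primary index, the secondary index on stud_name maps each key value to a list of records, because several records may share it.

```python
from collections import defaultdict

records = {                      # primary index: stud_id -> record
    101: {"stud_id": 101, "stud_name": "Ravi",  "course": "MCA"},
    102: {"stud_id": 102, "stud_name": "Priya", "course": "MCA"},
    103: {"stud_id": 103, "stud_name": "Ravi",  "course": "MBA"},
}

secondary = defaultdict(list)    # secondary index: stud_name -> [stud_id, ...]
for rid, rec in records.items():
    secondary[rec["stud_name"]].append(rid)

def find_by_name(name):
    """Return every record whose stud_name matches; possibly several."""
    return [records[rid] for rid in secondary.get(name, [])]

matches = find_by_name("Ravi")
print(len(matches))  # 2 -- the secondary key value is not unique
```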
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3 - EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and
every join dependency in it is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependency: if a relation cannot be losslessly decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer vendor item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
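The decomposition above can be checked mechanically: project Buying onto the three pairwise tables, then rejoin them. Because the join dependency holds on this sample data, the three-way join reconstructs exactly the original rows, with no spurious tuples.

```python
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three projections: Buyer-Vendor, Buyer-Item, Vendor-Item.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Three-way natural join of the projections.
rejoined = {
    (b, v, i)
    for b, v in buyer_vendor
    for v2, i in vendor_item if v2 == v
    if (b, i) in buyer_item
}

print(rejoined == buying)  # True: the decomposition is lossless
```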
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
Fig: IMS architecture - application programs A and B, each written in a host language plus DL/I,
access the IMS control program through the PCBs of their PSBs (PSB-A, PSB-B); the control
program in turn uses the DBDs that define the physical databases.
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the
associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE#,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example:
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
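The hierarchical flavor of this model can be suggested with a small sketch. This is not the real DL/I API, just an invented in-memory structure: segments form a tree (COURSE at the root, OFFERING below it, STUDENT below that), and an application walks child segments "within parent" rather than issuing joins.

```python
# Invented sample data shaped like the COURSE -> OFFERING -> STUDENT hierarchy.
course = {
    "name": "M23",
    "offerings": [
        {"date": "730813", "location": "Oslo",
         "students": [{"emp": "E1", "grade": "A"}, {"emp": "E2", "grade": "B"}]},
        {"date": "731021", "location": "Dublin",
         "students": [{"emp": "E3", "grade": "A"}]},
    ],
}

def students_within_course(course):
    """Loose analogue of repeated 'get next within parent' over STUDENT segments."""
    for offering in course["offerings"]:      # walk child OFFERING segments
        for student in offering["students"]:  # then their child STUDENT segments
            yield student["emp"]

print(list(students_within_course(course)))  # ['E1', 'E2', 'E3']
```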
OR
(C) Explain the following:
(i) Functional dependency
Ans: Functional Dependency: The value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key,
and no attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of the functional dependencies used in
normalization:
They have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of
the dependency; they hold for all time; and they are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation,
and that has the property that every functional dependency in Y is implied by the
functional dependencies in X.
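The definition above can be turned into a small checker (the relation and attribute names are invented for illustration): X → Y holds when no two tuples agree on X but differ on Y.

```python
def fd_holds(rows, X, Y):
    """Return True iff the functional dependency X -> Y holds in rows."""
    seen = {}
    for row in rows:
        x_val = tuple(row[a] for a in X)
        y_val = tuple(row[a] for a in Y)
        # setdefault returns the previously seen Y-value for this X-value.
        if seen.setdefault(x_val, y_val) != y_val:
            return False          # same determinant, different dependent
    return True

rows = [
    {"stud_id": 1, "dept": "CS", "dept_head": "Rao"},
    {"stud_id": 2, "dept": "CS", "dept_head": "Rao"},
    {"stud_id": 3, "dept": "EE", "dept_head": "Iyer"},
]
print(fd_holds(rows, ["dept"], ["dept_head"]))    # True: dept determines its head
print(fd_holds(rows, ["dept_head"], ["stud_id"]))  # False: Rao maps to both 1 and 2
```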
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF;
we will pay particular attention up to 3NF.
Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies
between its attributes.
It is often executed as a series of steps, where each step corresponds to a specific normal form with
known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
(Determinant: an attribute on which some other attribute is fully functionally dependent.)
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all of its multivalued dependencies are functional dependencies. 4NF
removes unwanted multivalued dependencies. For a relation to be in fourth normal form:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes are dependent between themselves.
Either of these conditions must hold. The relation must also be in BCNF; fourth normal form differs
from BCNF only in that it considers multivalued dependencies.
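The defining test for a multivalued dependency X ↠ Y in R(X, Y, Z) can be sketched directly (sample course/teacher/book data invented for illustration): the MVD holds exactly when R equals the join of its projections on (X, Y) and (X, Z), which is the redundancy pattern 4NF is designed to eliminate.

```python
def mvd_holds(rows, X, Y, Z):
    """Return True iff the MVD X ->> Y holds in R(X, Y, Z)."""
    xy = {tuple(r[a] for a in X + Y) for r in rows}   # projection on (X, Y)
    xz = {tuple(r[a] for a in X + Z) for r in rows}   # projection on (X, Z)
    # Natural join of the two projections on X.
    joined = {
        xyv + xzv[len(X):]
        for xyv in xy
        for xzv in xz
        if xyv[:len(X)] == xzv[:len(X)]
    }
    original = {tuple(r[a] for a in X + Y + Z) for r in rows}
    return joined == original

# course ->> teacher holds: teachers and books vary independently per course.
rows = [
    {"course": "DB", "teacher": "Rao",  "book": "Date"},
    {"course": "DB", "teacher": "Rao",  "book": "Korth"},
    {"course": "DB", "teacher": "Iyer", "book": "Date"},
    {"course": "DB", "teacher": "Iyer", "book": "Korth"},
]
print(mvd_holds(rows, ["course"], ["teacher"], ["book"]))  # True
```

Dropping one of the four rows breaks the independence, and the check returns False.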
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases also offer some kind of query language, allowing objects to be found by a more
declarative programming approach. It is in the area of object query languages, and the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a
relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way the schema of a database is defined. A
general characteristic, however, is that the programming language and the database schema use the same
type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the
set of all its versions, and object versions can be treated as objects in their own right. Some object
databases also provide systematic support for triggers and constraints, which are the basis of active
databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could get a user's account information and
efficiently provide extensive information such as transactions, account information entries, etc.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time, even to the best configured systems. This is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery is more easily achieved if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls the following:
The speed and size of your transaction log backups
The degree to which you might be at risk of losing committed transactions in the event of media failure
Models
There are three types of database recovery models available:
Full Recovery
Bulk Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. The SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
The Log Marks feature allows you to place reference points in the transaction log that allow you to recover to a log mark.
This model logs CREATE INDEX operations. Recovery from a transaction log backup that includes index creations is done at a faster pace because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT and UPDATETEXT.
Simple Recovery Model
It allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans:
New records are placed in a log file or transaction file.
A batch update is performed to merge the log file with the master file.
(ii) Direct file
Direct Access File System (DAFS) is a network file system, similar to Network File System (NFS) and Common Internet File System (CIFS), that allows applications to transfer data while bypassing operating system control, buffering, and network protocol operations that can bottleneck throughput. DAFS uses the Virtual Interface (VI) architecture as its underlying transport mechanism. Using VI hardware, an application transfers data to and from application buffers without using the operating system, which frees up the processor and operating system for other processes and allows files to be accessed by servers using several different operating systems. DAFS is designed and optimized for clustered, shared-file network environments that are commonly used for Internet, e-commerce, and database applications. DAFS is optimized for high-bandwidth InfiniBand networks, and it works with any interconnection that supports VI, including Fibre Channel and Ethernet.
Network Appliance and Intel formed the DAFS Collaborative as an industry group to specify and promote DAFS. Today, more than 85 companies are part of the DAFS Collaborative.
Q3
EITHER
(a) Explain tuple relational calculus.
Ans:
Relational Calculus
A relational calculus query specifies what is to be retrieved rather than how to retrieve it; there is no description of how to evaluate the query.
In first-order logic (or predicate calculus), a predicate is a truth-valued function with arguments.
When we substitute values for the arguments, the function yields an expression, called a proposition, which can be either true or false.
Relational Calculus
If a predicate contains a variable (e.g. 'x is a member of staff'), there must be a range for x.
When we substitute some values of this range for x, the proposition may be true; for other values it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
Tuple Relational Calculus
We are interested in finding tuples for which a predicate is true. This is based on the use of tuple variables.
A tuple variable is a variable that 'ranges over' a named relation, i.e. a variable whose only permitted values are tuples of the relation.
Specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Tuple Relational Calculus - Example
To find details of all staff earning more than $10,000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Tuple Relational Calculus
We can use two quantifiers to tell how many instances the predicate applies to:
Existential quantifier ∃ ('there exists')
Universal quantifier ∀ ('for all')
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called free variables.
Tuple Relational Calculus
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧ (B.branchNo = S.branchNo) ∧ B.city = 'London')
This means 'There exists a Branch tuple with the same branchNo as the branchNo of the current Staff tuple S, and it is located in London'.
Tuple Relational Calculus
The universal quantifier is used in statements about every instance, such as:
(∀B) (B.city ≠ 'Paris')
This means 'For all Branch tuples, the address is not in Paris'.
We can also use ~(∃B) (B.city = 'Paris'), which means 'There are no branches with an address in Paris'.
Tuple Relational Calculus
Formulae should be unambiguous and make sense.
A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
We can recursively build up formulae from atoms:
An atom is a formula.
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction F1 ∨ F2, and negation ~F1.
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also formulae.
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000.
{S.fName, S.lName | Staff(S) ∧ S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow.
{S | Staff(S) ∧ (∃P) (PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Tuple Relational Calculus
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
To avoid this, we add the restriction that all values in the result must be values in the domain of the expression.
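The declarative flavour of tuple relational calculus maps directly onto SQL. A small sketch using Python's sqlite3 module (the Staff rows here are invented for illustration; only the attribute names follow the examples above) shows both example queries:

```python
import sqlite3

# Illustrative Staff rows; attribute names follow the Staff relation above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT, fName TEXT, salary INTEGER)")
conn.executemany("INSERT INTO Staff VALUES (?, ?, ?)",
                 [("S1", "Ann", 12000), ("S2", "Bob", 9000), ("S3", "Eve", 30000)])

# {S | Staff(S) AND S.salary > 10000} -- whole tuples:
high_paid = conn.execute("SELECT * FROM Staff WHERE salary > 10000").fetchall()

# {S.salary | Staff(S) AND S.salary > 10000} -- a single projected attribute:
salaries = [row[0] for row in
            conn.execute("SELECT salary FROM Staff WHERE salary > 10000")]
```

In both cases the query states only the predicate to satisfy; the DBMS decides how to evaluate it.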
Data Manipulations in SQL
Select, Update, Delete, Insert Statement
Basic Data Retrieval
Condition Specification
Arithmetic and Aggregate Operators
SQL Join: Multiple Table Queries
Set Manipulation
Any, In, Contains, All, Not In, Not Contains, Exists, Union, Minus, Intersect
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement.
Data must be entered later using INSERT.
CREATE TABLE S ( SNO CHAR(5),
SNAME CHAR(20),
STATUS DECIMAL(3),
CITY CHAR(15),
PRIMARY KEY (SNO) )
Creating Tables
A table name and unique column names must be specified.
Columns which are defined as primary keys will never have two rows with the same key value.
A primary key may consist of more than one column (values unique in combination); this is called a composite key.
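Both cases can be sketched with Python's sqlite3 module: the S table from the text, plus a hypothetical supplier-part table SP (not in the original) whose composite key (SNO, PNO) must be unique in combination, not individually:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The supplier table S from the text, with SNO as its primary key:
conn.execute("""CREATE TABLE S (
    SNO    CHAR(5),
    SNAME  CHAR(20),
    STATUS DECIMAL(3),
    CITY   CHAR(15),
    PRIMARY KEY (SNO))""")
conn.execute("INSERT INTO S VALUES ('S1', 'Smith', 20, 'London')")

# A hypothetical table with a composite key (SNO, PNO):
conn.execute("""CREATE TABLE SP (
    SNO CHAR(5), PNO CHAR(6), QTY INTEGER,
    PRIMARY KEY (SNO, PNO))""")
conn.execute("INSERT INTO SP VALUES ('S1', 'P1', 300)")
conn.execute("INSERT INTO SP VALUES ('S1', 'P2', 200)")  # same SNO alone is fine
try:
    conn.execute("INSERT INTO SP VALUES ('S1', 'P1', 999)")  # duplicate pair
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True  # the DBMS rejects the repeated combination

sp_rows = conn.execute("SELECT COUNT(*) FROM SP").fetchone()[0]
```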
(b) Explain Data Manipulation in SQL
Ans
A data manipulation language (DML) is a computer programming language used for adding (inserting), deleting, and modifying (updating) data in a database. A DML is often a sublanguage of a broader database language such as SQL, with the DML comprising some of the operators in the language.[1] Read-only selecting of data is sometimes distinguished as being part of a separate data query language (DQL), but it is closely related and sometimes also considered a component of a DML; some operators may perform both selecting (reading) and writing.
A popular data manipulation language is that of Structured Query Language (SQL), which is used to retrieve and manipulate data in a relational database.[2] Other forms of DML are those used by IMS/DLI, CODASYL databases such as IDMS, and others.
In SQL, the data manipulation language comprises the SQL-data change statements,[3] which modify stored data but not the schema or database objects. Manipulation of persistent database objects, e.g. tables or stored procedures, via the SQL schema statements,[3] rather than the data stored within them, is considered to be part of a separate data definition language (DDL). In SQL these two categories are similar in their detailed syntax, data types, expressions, etc., but distinct in their overall function.[3]
The SQL-data change statements are a subset of the SQL-data statements; this set also contains the SELECT query statement,[3] which strictly speaking is part of the DQL, not the DML. In common practice, though, this distinction is not made, and SELECT is widely considered to be part of DML,[4] so the DML consists of all SQL-data statements, not only the SQL-data change statements. The SELECT ... INTO form combines both selection and manipulation, and thus is strictly considered to be DML because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a statement, which is almost always a verb. In the case of SQL, these verbs are:
SELECT ... FROM ... WHERE ... (strictly speaking DQL)
SELECT ... INTO ...
INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...
For example, the command to insert a row into table employees:
INSERT INTO employees (first_name, last_name, fname) VALUES ('John', 'Capita', 'xcapit00');
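The DML verbs above can be exercised end-to-end with Python's sqlite3 module. The employees table mirrors the insert example; the updated value 'Capital' is invented purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT INTO ... VALUES
conn.execute("INSERT INTO employees (first_name, last_name, fname) VALUES (?, ?, ?)",
             ("John", "Capita", "xcapit00"))
# UPDATE ... SET ... WHERE
conn.execute("UPDATE employees SET last_name = 'Capital' WHERE fname = 'xcapit00'")
# SELECT ... FROM ... WHERE (strictly speaking DQL)
row = conn.execute("SELECT last_name FROM employees WHERE fname = 'xcapit00'").fetchone()
# DELETE FROM ... WHERE
conn.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
remaining = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
```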
OR
(c) Explain the following integrity rules.
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce these rules automatically, but it is safer to make sure that the rules are already applied in the design. There are two types of integrity mentioned in integrity rules: entity and reference. Two additional rules that aren't necessarily included in integrity rules, but are pertinent to database designs, are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that is unique; this ensures that each row is uniquely identified by the primary key. One requirement for entity integrity is that a primary key cannot have a null value. The purpose of this integrity is to give each row a unique identity, so that foreign key values can properly reference primary key values.
Theta Join
In a theta join we apply a condition on the input relation(s), and then only the selected rows are used in the cross product to be merged and included in the output. In a normal cross product, all the rows of one relation are mapped/merged with all the rows of the second relation, but here only selected rows of a relation are made part of the cross product with the second relation. It is denoted R ⋈θ S.
If R and S are two relations, then θ is the condition which is applied for the select operation on one relation; only the selected rows then form a cross product with all the rows of the second relation. For example, take two relations, FACULTY and COURSE: we first apply the select operation on the FACULTY relation to select certain specific rows, and then these rows have a cross product with the COURSE relation. This is the difference between cross product and theta join.
We will now see both relations, their different attributes, and then finally the cross product after carrying out the select operation on the relation.
From this example the difference between cross product and theta join becomes clear.
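The operation can be sketched directly as a filtered cross product. The FACULTY and COURSE rows and attributes below are invented for illustration; only the relation names follow the text:

```python
# Illustrative relations (tuples are hypothetical):
FACULTY = [("F1", "Ali", "CS"), ("F2", "Sara", "Math")]
COURSE  = [("C1", "Databases", "F1"), ("C2", "Calculus", "F2"),
           ("C3", "Algorithms", "F1")]

def theta_join(r, s, theta):
    """Merge every pair of tuples from the cross product r x s that satisfies theta."""
    return [a + b for a in r for b in s if theta(a, b)]

# Here theta compares the faculty id with the course's teacher id
# (an equality condition, i.e. an equi-join, a special case of theta join).
result = theta_join(FACULTY, COURSE, lambda f, c: f[0] == c[2])
```

Replacing the lambda with any other comparison (for instance `!=` or `<` on comparable attributes) gives the general theta join; with no condition at all the function degenerates to the plain cross product.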
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having the foreign key (in the associated table) reference a primary key value (in the primary, or parent, table). Because of this, we need to ensure that data on both sides of the relationship remain intact.
So, referential integrity requires that whenever a foreign key value is used, it must reference a valid, existing primary key in the parent table.
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no foreign key in any related table with the value of 15. We should only be able to delete a primary key if there are no associated records. Otherwise, we would end up with an orphaned record.
Here, the related table contains a foreign key value that doesn't exist in the primary key field of the primary table (i.e. the "CompanyId" field). This has resulted in an "orphaned record".
So, referential integrity will prevent users from:
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually with no indication of an error. This could result in records being "lost" in the database, because they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated company).
Or worse yet, it could result in customers not receiving products they paid for.
Worse still, it could affect life and death situations, such as a hospital patient not receiving the correct treatment, or a disaster relief team not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity, which is concerned with the accuracy and consistency of all data (relationship or otherwise). Maintaining data integrity is a crucial part of working with databases.
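The CompanyId example can be demonstrated with sqlite3 (the product table is hypothetical; note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.execute("CREATE TABLE company (CompanyId INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE product (
    ProductId INTEGER PRIMARY KEY,
    CompanyId INTEGER REFERENCES company(CompanyId))""")
conn.execute("INSERT INTO company VALUES (15, 'Acme')")
conn.execute("INSERT INTO product VALUES (1, 15)")

# Adding a related row with no associated parent record is rejected:
try:
    conn.execute("INSERT INTO product VALUES (2, 99)")
    orphan_insert_ok = True
except sqlite3.IntegrityError:
    orphan_insert_ok = False

# Deleting a parent that still has matching related records is rejected too:
try:
    conn.execute("DELETE FROM company WHERE CompanyId = 15")
    parent_delete_ok = True
except sqlite3.IntegrityError:
    parent_delete_ok = False
```

Both rejected statements are exactly the operations listed above that referential integrity must prevent; no orphaned record can be created.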
(d) Explain the following in detail with examples.
(i) Domain
Ans: Definition: The domain of a database attribute is the set of all allowable values that attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are the only permitted entries in that column.
In data management and database analysis, a data domain refers to all the unique values which a data element may contain. The rule for determining the domain boundary may be as simple as a data type with an enumerated list of values.[1]
For example, a database table that has information about people, with one record per person, might have a gender column. This gender column might be declared as a string data type, and allowed to have one of two known code values: M for male, F for female, and NULL for records where gender is unknown or not applicable (or, arguably, U for unknown as a sentinel value). The data domain for the gender column is {M, F}.
In a normalized data model, the reference domain is typically specified in a reference table. Following the previous example, a Gender reference table would have exactly two records, one per allowed value, excluding NULL. Reference tables are formally related to other tables in a database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check constraint or, in more complex cases, in a database trigger. For example, a column requiring positive numeric values may have a check constraint declaring that the values must be greater than zero.
This definition combines the concept of domain as an area over which control is exercised with the mathematical idea of a set of values of an independent variable for which a function is defined.
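Both domain rules described above, the enumerated gender domain and the positive-value rule, can be enforced with CHECK constraints; a sketch in sqlite3 (table names are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The {M, F} domain from the text, with NULL allowed for unknown gender
# (a CHECK that evaluates to NULL does not reject the row):
conn.execute("""CREATE TABLE person (
    name   TEXT,
    gender TEXT CHECK (gender IN ('M', 'F')))""")
conn.execute("INSERT INTO person VALUES ('Ann', 'F')")
conn.execute("INSERT INTO person VALUES ('Unknown', NULL)")

try:
    conn.execute("INSERT INTO person VALUES ('Bad', 'X')")  # outside the domain
    out_of_domain_ok = True
except sqlite3.IntegrityError:
    out_of_domain_ok = False

# The numeric domain rule: values must be greater than zero.
conn.execute("CREATE TABLE item (qty INTEGER CHECK (qty > 0))")
try:
    conn.execute("INSERT INTO item VALUES (-3)")
    negative_ok = True
except sqlite3.IntegrityError:
    negative_ok = False
```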
(ii) Degree and cardinality
The degree of relationship (also known as cardinality) is the number of occurrences in one entity which are associated (or linked) to the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
The latter is correctly written M:N, not M:M.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-to-one relationship rarely exists in practice, but it can; in that case you may consider combining the two entities into one.
For example, an employee is allocated a company car which can only be driven by that employee.
Therefore there is a one-to-one relationship between employee and company car.
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example, taking the employee and department entities shown on the previous page, an employee works in one department, but a department has many employees.
Therefore there is a one-to-many relationship between department and employee.
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would prevent any such relationships, but the definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely exist. Normally they occur because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a team of many employees.
Therefore there is a many-to-many relationship between employee and project.
Q4
EITHER
(a) Explain DBTG Data Manipulation.
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on Data Systems Languages (CODASYL), the group responsible for standardization of the programming language COBOL. The DBTG final report appeared in April 1971; it introduced a new, distinct and self-contained language. The DBTG is intended to meet the requirements of many distinct programming languages, not just COBOL: the user in a DBTG system is considered to be an ordinary application programmer, and the language therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the Data Definition Language or DDL), the DBTG proposed a Subschema Data Definition Language (Subschema DDL) for defining views of the conceptual scheme that was itself defined using the Data Definition Language. It also proposed a Data Manipulation Language (DML) suitable for writing application programs that manipulate the conceptual scheme or a view.
Architecture of DBTG Model
The architecture of a DBTG system is illustrated in Figure. The architecture of the DBTG model can be divided into three different levels, as in the architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema, written in a Data Storage Description Language (DSDL).
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists essentially of definitions of the various types of record in the database, the data items they contain, and the sets into which they are grouped. (Here, logical record types are referred to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists essentially of a specification of which schema record types the user is interested in, which schema data items he or she wishes to see in those records, and which schema relationships (sets) linking those records he or she wishes to consider. By default, all other types of record, data item and set are excluded.
In the DBTG model, the users are application programmers writing in an ordinary programming language such as COBOL that has been extended to include the DBTG data manipulation language. Each application program invokes the corresponding subschema; using the COBOL Data Base Facility, for example, the programmer simply specifies the name of the required subschema in the Data Division of the program. This invocation provides the definition of the user work area (UWA) for that program. The UWA contains a distinct location for each type of record (and hence for each data item) defined in the subschema. The program may refer to these data item and record locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization: the process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF. We will pay particular attention up to 3NF.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table:
transform data from the information source (e.g. a form) into table format with columns and rows.
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
If a table of data meets the definition of a relation, it is in first normal form:
Every relation has a unique name.
Every attribute value is atomic (single-valued).
Every row is unique.
Attributes in tables have unique names.
The order of the columns is irrelevant.
The order of the rows is irrelevant.
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove the repeating group by:
entering appropriate data into the empty columns of rows containing the repeating data ('flattening' the table),
or by
placing the repeating data, along with a copy of the original key attribute(s), into a separate relation.
Second Normal Form (2NF)
Based on the concept of full functional dependency:
A and B are attributes of a relation.
B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A.
2NF: a relation that is in 1NF and in which every non-primary-key attribute is fully functionally dependent on the primary key.
Second Normal Form (2NF)
1NF and no partial functional dependencies.
Partial functional dependency: when one or more non-key attributes are functionally dependent on part of the primary key.
Every non-key attribute must be defined by the entire key, not just by part of the key.
If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.
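The three steps above can be walked through on a small hypothetical relation (not from the text): OrderLine(orderNo, productNo, qty, productDesc) with composite key (orderNo, productNo), where productDesc depends only on productNo, a partial dependency that 2NF removes:

```python
# 1NF relation with a partial dependency: productDesc depends only on
# productNo, part of the composite key (orderNo, productNo).
order_line = [
    ("O1", "P1", 3, "Pen"),
    ("O1", "P2", 1, "Pad"),
    ("O2", "P1", 5, "Pen"),   # "Pen" repeated: the redundancy 2NF removes
]

# Step 3: move the partially dependent attribute into a new relation
# together with a copy of its determinant (productNo)...
product = sorted({(p_no, desc) for (_, p_no, _, desc) in order_line})

# ...and drop it from the original relation, which is now in 2NF.
order_line_2nf = [(o, p, q) for (o, p, q, _) in order_line]
```

After the decomposition each product description is stored exactly once, so updating it can no longer leave the relation inconsistent.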
Third Normal Form (3NF)
2NF and no transitive dependencies.
Transitive dependency: a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency:
A, B and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).
3NF: a relation that is in 1NF and 2NF, and in which no non-primary-key attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with a suitable example.
Ans:
As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic.
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
5. BCNF: R is in BCNF iff every determinant is a candidate key.
6. Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (or 4NF) requires that there be no non-trivial multi-valued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and its multi-valued dependencies are functional dependencies. 4NF removes the unwanted data structures: multi-valued dependencies.
One of the following conditions must hold for a relation to be in fourth normal form:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers multivalued dependencies.
(d) What are inference axioms? Explain their significance in Relational Database Design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity from (2) and (4)
[From Maier]
1. Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
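The Maier derivation can be checked mechanically with the standard attribute-closure algorithm: AB → GH holds iff G and H appear in the closure of {A, B} under F. A sketch in Python using the FDs given above:

```python
# FDs from the Maier example: AB->E, AG->J, BE->I, E->G, GI->H.
FDS = [({"A", "B"}, {"E"}), ({"A", "G"}, {"J"}), ({"B", "E"}, {"I"}),
       ({"E"}, {"G"}), ({"G", "I"}, {"H"})]

def closure(attrs, fds):
    """Repeatedly fire every FD whose left side is covered until nothing new appears."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

ab_closure = closure({"A", "B"}, FDS)  # contains G and H, so AB -> GH holds
```

This closure computation is the practical significance of the axioms: it is how a design tool decides which dependencies F implies, and hence what the candidate keys are.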
Significance in Relational Database Design: The inference axioms allow all the functional dependencies implied by a given set F (the closure F+) to be derived; this is the basis for finding candidate keys and for normalizing relations. More broadly, they underpin the relational model: a database structure in which data is stored in two-dimensional tables, where multiple relationships between data elements can be defined and established in an ad-hoc manner. A relational database management system (RDBMS) is a database system made up of files with data elements in a two-dimensional array (rows and columns); it has the capability to recombine data elements to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables that:
• are manipulated a set at a time, rather than a record at a time;
• are manipulated using SQL.
The relational model was proposed by Dr Codd in 1970 and is the basis for the relational database management system (RDBMS). The relational model contains the following components:
• a collection of objects or relations
• a set of operations to act on the relations
Q5
EITHER
(a) What is a deadlock? How can it be avoided? How can it be resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that is being locked by the other user. It can be dealt with in two ways: one is to set measures which prevent deadlocks from happening, and the other is to set ways in which to break a deadlock after it happens. One way to prevent or avoid deadlocks is to require the user to request all necessary locks at one time, ensuring they gain access to everything they need or nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order, meaning resources must be locked in a certain order. Once a deadlock does occur, the DBMS must have a method for detecting it; to resolve it, the DBMS must select a transaction to cancel and revert the entire transaction until the required resources become available, allowing one transaction to complete while the other has to be reprocessed at a later time.
Explain the meaning of the expression ACID transaction.
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it should be atomic: it should either be complete or fully incomplete; there should not be anything like semi-complete. The database state should remain consistent after the completion of the transaction. If there is more than one transaction, the transactions should be scheduled in such a fashion that they remain in isolation from one another. Durability means that once a transaction commits, its effects will persist even if there are system failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the process of being changed. Their purpose is to ensure consistency throughout the database. For example, if I am changing a row which affects the calculations or outputs of several other rows, then all rows that are affected, or possibly affected, by a change in the row I am working on will be locked from changes until my change is complete. This isolates the change and ensures that the data interaction remains accurate and consistent, and is known as transaction-level consistency. The transaction being changed, which may affect several other pieces of data or rows of input, could also affect how those rows are read. So, say I am processing a change to the tax rate in my state: my store clerk shouldn't be able to read the total cost of a blue shirt, because the total cost row is affected by any change in the tax rate row. Essentially, how you deal with the reading and viewing of data while a change is being processed but hasn't been committed is known as the transaction isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction being committed.
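Atomicity is easy to observe in practice. A sketch using sqlite3 as a stand-in DBMS (the account table and amounts are invented): a transfer interrupted by a simulated failure leaves the database unchanged, while the same transfer without the failure commits both writes:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, amount, fail_midway=False):
    with conn:  # sqlite3 wraps the block in a transaction: commit or roll back
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = 'A'",
                     (amount,))
        if fail_midway:
            raise RuntimeError("simulated crash between the two writes")
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = 'B'",
                     (amount,))

try:
    transfer(conn, 30, fail_midway=True)
except RuntimeError:
    pass  # the partial debit was rolled back: no semi-complete state
after_failure = dict(conn.execute("SELECT id, balance FROM account"))

transfer(conn, 30)  # the same transfer without the crash commits both writes
after_success = dict(conn.execute("SELECT id, balance FROM account"))
```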
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment, where multiple transactions can be executed simultaneously, it is highly important to control the concurrency of transactions. We have concurrency control protocols to ensure atomicity, isolation and serializability of concurrent transactions. Concurrency control protocols can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two kinds:
Binary locks: A lock on a data item can be in two states; it is either locked or unlocked.
Shared/exclusive locks: This type of locking mechanism differentiates locks based on their use. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock; allowing more than one transaction to write to the same data item would lead the database into an inconsistent state. Read locks are shared, because no data value is being changed.
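The shared/exclusive compatibility rule can be sketched as a small lock table (a toy illustration; the class and method names are ours, not any particular DBMS's API):

```python
class LockTable:
    """Toy shared/exclusive lock table: many readers OR one writer per item."""
    def __init__(self):
        self.sharers = {}    # item -> set of transaction ids holding a read lock
        self.writer = {}     # item -> transaction id holding the write lock

    def acquire_shared(self, txn, item):
        # A read lock is compatible with other read locks,
        # but not with an exclusive lock held by another transaction.
        if item in self.writer and self.writer[item] != txn:
            return False
        self.sharers.setdefault(item, set()).add(txn)
        return True

    def acquire_exclusive(self, txn, item):
        # A write lock is incompatible with any lock held by others.
        others_read = self.sharers.get(item, set()) - {txn}
        if others_read or (item in self.writer and self.writer[item] != txn):
            return False
        self.writer[item] = txn
        return True

    def release_all(self, txn):
        for holders in self.sharers.values():
            holders.discard(txn)
        self.writer = {i: t for i, t in self.writer.items() if t != txn}

lt = LockTable()
print(lt.acquire_shared("T1", "x"))     # True  - first reader
print(lt.acquire_shared("T2", "x"))     # True  - read locks are shared
print(lt.acquire_exclusive("T3", "x"))  # False - readers block the writer
lt.release_all("T1"); lt.release_all("T2")
print(lt.acquire_exclusive("T3", "x"))  # True  - item now free
```

A real lock manager would also queue waiters and detect deadlocks; the compatibility matrix, however, is exactly this.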
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a write operation is performed. Transactions may unlock the data item after completing the 'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they need locks. Before initiating an execution, the transaction requests the system for all the locks it needs beforehand. If all the locks are granted, the transaction executes and releases all the locks when all its operations are over. If all the locks are not granted, the transaction rolls back and waits until all the locks are granted.
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first part, when the transaction starts executing, it seeks permission for the locks it requires. The second part is where the transaction acquires all the locks. As soon as the transaction releases its first lock, the third phase starts; in this phase, the transaction cannot demand any new locks, it only releases the acquired locks.
Two-phase locking thus has two phases: a growing phase, where all the locks are being acquired by the transaction, and a shrinking phase, where the locks held by the transaction are being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then upgrade it to an exclusive lock.
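The growing/shrinking discipline can be enforced with a single flag, as in this sketch (illustrative names; a real system would combine this with the lock table above):

```python
class TwoPhaseTxn:
    """Enforces the 2PL discipline: no lock may be acquired
    after the first lock has been released."""
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False   # flips to True at the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: cannot acquire in shrinking phase")
        self.locks.add(item)     # growing phase

    def unlock(self, item):
        self.shrinking = True    # entering the shrinking phase
        self.locks.discard(item)

t = TwoPhaseTxn("T1")
t.lock("A"); t.lock("B")   # growing phase: acquire everything needed
t.unlock("A")              # first release starts the shrinking phase
try:
    t.lock("C")            # illegal under 2PL
except RuntimeError as e:
    print(e)               # 2PL violation: cannot acquire in shrinking phase
```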
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock after using it: it holds all the locks until the commit point and releases them all at once.
Strict-2PL therefore does not suffer from cascading aborts as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between conflicting pairs of transactions at the time of execution, whereas timestamp-based protocols start working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age of the transaction. A transaction created at clock time 0002 would be older than all transactions that come after it; for example, any transaction y entering the system at 0004 is two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read-timestamp and write-timestamp. This lets the system know when the last 'read' and 'write' operations were performed on the data item.
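The basic timestamp-ordering checks that use these per-item timestamps can be sketched as follows (a simplified illustration; a real scheduler would also roll back and restart the aborted transaction with a new timestamp):

```python
# Basic timestamp-ordering checks. Each item carries the timestamps of the
# last read and last write; an older transaction must not act "after" a
# younger one on the same item.
read_ts, write_ts = {}, {}   # item -> timestamp of last read / last write

def read(ts, item):
    # An older transaction may not read a value written by a younger one.
    if ts < write_ts.get(item, 0):
        return "abort"
    read_ts[item] = max(read_ts.get(item, 0), ts)
    return "ok"

def write(ts, item):
    # An older transaction may not overwrite data already read or
    # written by a younger one.
    if ts < read_ts.get(item, 0) or ts < write_ts.get(item, 0):
        return "abort"
    write_ts[item] = ts
    return "ok"

print(write(2, "x"))   # ok    - first write, W-ts(x) becomes 2
print(read(4, "x"))    # ok    - younger reader, R-ts(x) becomes 4
print(write(3, "x"))   # abort - T(3) is older than the last reader T(4)
print(read(1, "x"))    # abort - T(1) is older than the last writer T(2)
```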
OR
(b) Explain database security mechanisms. (8)
Database security covers and enforces security on all aspects and components of databases. This includes:
Data stored in the database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment against theft and natural disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and implementing a road map/plan to mitigate them
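One concrete application-level access control, sketched here with Python's built-in sqlite3 (the users table and payload are hypothetical): passing user input as a bound parameter instead of splicing it into the SQL string prevents SQL injection.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, role TEXT)")
db.executemany("INSERT INTO users VALUES (?, ?)",
               [("alice", "admin"), ("bob", "clerk")])

malicious = "nobody' OR '1'='1"   # classic injection payload

# Unsafe: string splicing turns the payload into live SQL
# and returns every row in the table.
unsafe = db.execute(
    f"SELECT name FROM users WHERE name = '{malicious}'").fetchall()

# Safe: the driver binds the payload as a plain value,
# so it matches nothing.
safe = db.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)).fetchall()

print(len(unsafe), len(safe))  # 2 0
```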
(d) Explain knowledge-based database systems in detail.
Ans:
The term knowledge-base was coined to distinguish this form of knowledge store from the more common and widely used term database. At the time (the 1970s), virtually all large Management Information Systems stored their data in some type of hierarchical or relational database. At this point in the history of Information Technology, the distinction between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
Flat data: Data was usually represented in a tabular format with strings or numbers in each field.
Multiple users: A conventional database needed to support more than one user or system logged into the same data at the same time.
Transactions: An essential requirement for a database was to maintain integrity and consistency among data accessed by concurrent users. These are the so-called ACID properties: Atomicity, Consistency, Isolation and Durability.
Large, long-lived data: A corporate database needed to support not just thousands but hundreds of thousands or more rows of data. Such a database usually needed to persist past the specific uses of any individual program; it needed to store data for years and decades rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database requirements. An expert system requires structured data: not just tables with numbers and strings, but pointers to other objects that in turn have additional pointers. The ideal representation for a knowledge base is an object model (often called an ontology in the artificial intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with requiring transactional properties on data. The data for the early expert systems was used to arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response to an emergency.[1] Once the solution to the problem was known, there was not a critical demand to store large amounts of data back to a permanent memory store. A more precise statement would be that, given the technologies available, researchers compromised and did without these capabilities because they realized they were beyond what could be expected, and they could develop useful solutions to non-trivial problems without them. Even from the beginning, the more astute researchers realized the potential benefits of being able to store, analyze and reuse knowledge. For example, see the discussion of Corporate Memory in the earliest work of the Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge-base compared to a conventional database. The knowledge-base needed to know facts about the world, for example to represent the statement that "all humans are mortal". A database typically could not represent this general knowledge, but would instead need to store thousands of rows of information about specific humans. Representing that all humans are mortal, and being able to reason that any given human is mortal, is the work of a knowledge-base. Representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
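The contrast can be sketched in a few lines (a toy illustration, not a production inference engine): the database stores one explicit fact per individual, while the knowledge base stores a general rule and derives the specific facts on demand.

```python
# Database style: one explicit row per individual.
customers = [("George", "human"), ("Mary", "human"), ("Sam", "human")]

# Knowledge-base style: a general rule applied to known individuals.
def is_mortal(entity_type):
    # Rule: all humans are mortal.
    return entity_type == "human"

derived = [name for name, kind in customers if is_mortal(kind)]
print(derived)  # ['George', 'Mary', 'Sam']
```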
As expert systems moved from being prototypes to systems deployed in corporate environments, the requirements for their data storage rapidly started to overlap with the standard database requirements for multiple, distributed users with support for transactions. Initially, the demand could be seen in two different but competitive markets. From the AI and object-oriented communities, object-oriented databases such as Versant emerged; these were systems designed from the ground up to have support for object-oriented capabilities, but also to support standard database services. On the other hand, the large database vendors, such as Oracle, added capabilities to their products that provided support for knowledge-base requirements, such as class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet. With the rise of the Internet, documents, hypertext and multimedia support were now critical for any corporate database. It was no longer enough to support large tables of data or relatively small objects that lived primarily in computer memory; support for corporate web sites required persistence and transactions for documents. This created a whole new discipline known as Web Content Management. The other driver for document support was the rise of knowledge management vendors, such as Lotus Notes. Knowledge management actually predated the Internet, but with the Internet there was great synergy between the two areas. Knowledge management products adopted the term knowledge-base to describe their repositories, but the meaning had a subtle difference. In the case of previous knowledge-based systems, the knowledge was primarily for the use of an automated system, to reason about and draw conclusions about the world. With knowledge management products, the knowledge was primarily meant for humans, for example to serve as a repository of manuals, procedures, policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a system that could really be cleanly classified as knowledge-based in the sense of an expert system that performed automated reasoning, or knowledge-based in the sense of knowledge management that provided knowledge in the form of documents and media that could be leveraged by us humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018 – 2019
Subject: DBMS
MCA 1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another. Concurrent access is quite easy if all users are just reading data; there is no way they can interfere with one another. Any practical database, though, will have a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user system. It helps you to make sure that database transactions are performed concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, translit. átomos, lit. 'undividable') is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client: at one moment in time it has not yet happened, and at the next it has already occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B. Performing these operations in an atomic transaction ensures that the database remains in a consistent state, that is, money is neither lost nor created if either of those two operations fails.
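The transfer example can be sketched with sqlite3 (the accounts and amounts are hypothetical; the simulated crash stands in for any failure between the two operations):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance REAL)")
db.executemany("INSERT INTO account VALUES (?, ?)",
               [("A", 100.0), ("B", 50.0)])
db.commit()

def transfer(amount):
    """Withdraw from A and deposit to B as one atomic unit."""
    try:
        with db:  # the connection as a context manager = one transaction
            db.execute("UPDATE account SET balance = balance - ? "
                       "WHERE name = 'A'", (amount,))
            # Simulate a failure between the withdrawal and the deposit.
            raise RuntimeError("crash before deposit")
            db.execute("UPDATE account SET balance = balance + ? "
                       "WHERE name = 'B'", (amount,))
    except RuntimeError:
        pass  # the 'with' block rolled the withdrawal back

transfer(30.0)
balances = dict(db.execute("SELECT name, balance FROM account"))
print(balances)  # {'A': 100.0, 'B': 50.0} - money neither lost nor created
```

Because the whole transfer is one transaction, the failed run leaves both balances untouched rather than leaving account A debited with nothing credited to B.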
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user view is different from the way data is stored in the database; this view describes only a part of the actual database. Each user is not concerned with the entire database, so only the part that is relevant to the user is visible. For example, end users and application programmers get different external views.
Each user uses a language to carry out database operations. The application programmer uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares database objects, while the data manipulation language performs operations on these objects. The data control language is used to control the user's access to database objects.
Conceptual Level: This level comes between the external and the internal levels. The conceptual level represents the entire database as a whole and is used by the DBA. This level is the view of the data "as it really is". The user's view of the data is constrained by the language that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level: This level deals with the physical storage of data and is the lowest level of the architecture. The internal level describes the physical sequence of the stored records.
The objectives of the three-level architecture proposal for DBMS are thus suitably explained above.
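The external level can be illustrated with a SQL view (sketched with sqlite3; the employee schema is hypothetical): each user class sees only the part of the conceptual schema relevant to it, while the stored table stays unchanged.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Conceptual level: the full employee relation.
db.execute("CREATE TABLE employee (id INTEGER, name TEXT, salary REAL)")
db.execute("INSERT INTO employee VALUES (1, 'Asha', 52000.0)")

# External level: a receptionist's view hides the salary column.
db.execute("CREATE VIEW phone_list AS SELECT id, name FROM employee")

row = db.execute("SELECT * FROM phone_list").fetchone()
cols = [d[0] for d in db.execute("SELECT * FROM phone_list").description]
print(cols, row)  # ['id', 'name'] (1, 'Asha')
```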
(C) Describe the structure of DBMS.
Ans: The DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations (insert, delete, update and retrieval) on the database. The components of the DBMS perform these requested operations on the database and provide the necessary data to the users.
Fig.: Structure of Database Management System
Components of DBMS:
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler: The Data Definition Language compiler processes schema definitions specified in the DDL. It includes metadata information such as the names of the files and data items, storage details of each file, mapping information, constraints, etc.
2. DML Compiler and Query Optimizer: DML commands such as insert, update, delete and retrieve from the application program are sent to the DML compiler for compilation into object code for database access. The object code is then optimized by the query optimizer into the best way to execute the query, and then sent to the data manager.
3. Data Manager: The data manager is the central software component of the DBMS, also known as the Database Control System.
The main functions of the data manager are:
It converts operations in users' queries, coming from the application programs or from the combination of DML compiler and query optimizer (known as the query processor), from the user's logical view to the physical file system.
It controls access to DBMS information that is stored on disk.
It also handles buffers in main memory.
It also enforces constraints to maintain the consistency and integrity of the data.
It also synchronizes the simultaneous operations performed by concurrent users.
It also controls the backup and recovery operations.
4. Data Dictionary: The data dictionary is a repository of descriptions of the data in the database. It contains information about:
1. Data: names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structure, access paths, and file and record sizes.
5. Access authorization: a description of database users, their responsibilities and their access rights.
6. Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation and accuracy. It may be used as an important part of the DBMS.
Importance of Data Dictionary:
The data dictionary is necessary in databases for the following reasons:
It improves the control of the DBA over the information system and the users' understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
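Most DBMSs expose the data dictionary as queryable system tables. A sketch with SQLite, whose catalog is the sqlite_master table (the student schema is hypothetical):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")
db.execute("CREATE INDEX idx_name ON student(name)")

# SQLite's data dictionary is the sqlite_master catalog: it records
# every table and index together with the DDL that defined it.
catalog = db.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name").fetchall()
print(catalog)  # [('index', 'idx_name'), ('table', 'student')]
```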
5. Data Files: These contain the data portion of the database.
6. Compiled DML: The DML compiler converts high-level queries into low-level file access commands known as compiled DML.
7. End Users: The users of the database system can be classified into the following groups, depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve Users: Naïve users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls into this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database: in the case of the user of an automatic teller machine, only one or more of his or her own accounts. Other such naïve users are those for whom the type and range of response is always indicated. Thus, even a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online Users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naïve users requiring help, such as menus.
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general-purpose programming language, such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It implies the separation of physical storage from the use of the data by an application program, i.e. program/data independence. The user, programmer or application specialist need not know the details of how the data are stored; such details are transparent to the user. Changes can be made to the data without affecting other components of the system, e.g. changing the format of data items (real to integer arithmetic operations), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy: In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updating the same data in different files
• Time wasted in entering the same data again and again
• Computer resources being needlessly used
• Difficulty in combining information
2. Elimination of Inconsistency: In the file processing system, information is duplicated throughout the system, so changes made in one file may have to be carried over to another file. This may lead to inconsistent data, so we need to remove this duplication of data in multiple files to eliminate inconsistency.
3. Better Service to the Users: A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness is likely to improve, since the data can now be shared and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, use of a DBMS should allow users that don't know programming to interact with the data more easily, unlike a file processing system, where the programmer may need to write new programs to meet every new demand.
4. Flexibility of the System is Improved: Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5. Integrity can be Improved: Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updating or changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be Enforced: Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security can be Improved: In conventional systems, applications are developed in an ad hoc/temporary manner. Often different systems of an organization would access different components of the operational data; in such an environment, enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8. Organization's Requirements can be Identified: All organizations have sections and departments, and each of these units often considers the work of its unit as the most important, and therefore considers its needs as the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. It may become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9. Overall Cost of Developing and Maintaining Systems is Lower: It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for similar services using conventional systems, since the productivity of programmers can be higher when using the non-procedural languages that have been developed for DBMSs than when using procedural languages.
10. A Data Model must be Developed: Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed as per the needs of particular applications, and the overall view is often not considered. Building an overall view of an organization's data is usually cost-effective in the long term.
11. Provides Backup and Recovery: Centralizing a database provides schemes such as recovery and backup from failures, including disk crashes, power failures and software errors, which may help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods are very complex.
QUE 2- EITHER
(A) Explain the ER model with a suitable example.
Ans: It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an iterative, team-oriented process, with all business managers (or designates) involved, and should be validated with a "bottom-up" approach. It has three primary components: entities, relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A student's (entity) attributes: student ID, student name, address, etc.
Attributes are of various types:
Simple/single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds, connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings:
Rectangles represent entity sets.
Diamonds represent relationship sets.
Lines link attributes to entity sets and entity sets to relationship sets.
Ellipses represent attributes.
Double ellipses represent multivalued attributes.
Dashed ellipses denote derived attributes.
Underline indicates primary key attributes.
Example:
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
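One common way to realize this entity in a relational schema is to flatten the composite attributes into columns, sketched here with sqlite3 (the DDL and column names are one illustrative mapping, not the only possible one):

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Composite attributes (name, address, street) flattened into columns;
# a derived attribute such as age would be computed, not stored.
db.execute("""
CREATE TABLE customer (
    customer_id      INTEGER PRIMARY KEY,
    first_name       TEXT,
    middle_name      TEXT,
    last_name        TEXT,
    phone_number     TEXT,
    date_of_birth    TEXT,
    city             TEXT,
    state            TEXT,
    zip_code         TEXT,
    street_name      TEXT,
    street_number    TEXT,
    apartment_number TEXT
)""")
db.execute("INSERT INTO customer (customer_id, first_name, last_name) "
           "VALUES (1, 'Jane', 'Doe')")
n = db.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(n)  # 1
```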
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file and direct file organizations, we have considered the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
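A sketch with sqlite3 (hypothetical student data): a secondary index on the non-key attribute supports this kind of retrieval, and, unlike a primary-key lookup, the search can return several records.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, stud_name TEXT)")
db.executemany("INSERT INTO student VALUES (?, ?)",
               [(1, "Ravi"), (2, "Priya"), (3, "Ravi")])

# A secondary index on the non-key attribute speeds up this retrieval.
db.execute("CREATE INDEX idx_stud_name ON student(stud_name)")

# A secondary-key search can match multiple records.
rows = db.execute(
    "SELECT roll FROM student WHERE stud_name = ? ORDER BY roll",
    ("Ravi",)).fetchall()
print(rows)  # [(1,), (3,)]
```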
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3- EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE 4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot have a lossless decomposition into any number of smaller tables.
Another way of expressing this is that each join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:

buyer | vendor        | item
Sally | Liz Claiborne | Blouses
Mary  | Liz Claiborne | Blouses
Sally | Jordach       | Jeans
Mary  | Jordach       | Jeans
Sally | Jordach       | Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
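The decomposition above can be sketched with SQLite; this is a minimal illustration (table and column names are my own, not from the original answer) showing that joining the three projections back on their common keys reconstructs exactly the original Buying rows:

```python
import sqlite3

# In-memory database to demonstrate the 5NF decomposition of Buying(buyer, vendor, item).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

rows = [
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary", "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary", "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
]

# The three projections that replace the single Buying table.
cur.execute("CREATE TABLE buyer_vendor (buyer TEXT, vendor TEXT, PRIMARY KEY (buyer, vendor))")
cur.execute("CREATE TABLE buyer_item (buyer TEXT, item TEXT, PRIMARY KEY (buyer, item))")
cur.execute("CREATE TABLE vendor_item (vendor TEXT, item TEXT, PRIMARY KEY (vendor, item))")

cur.executemany("INSERT OR IGNORE INTO buyer_vendor VALUES (?, ?)", [(b, v) for b, v, _ in rows])
cur.executemany("INSERT OR IGNORE INTO buyer_item VALUES (?, ?)", [(b, i) for b, _, i in rows])
cur.executemany("INSERT OR IGNORE INTO vendor_item VALUES (?, ?)", [(v, i) for _, v, i in rows])

# Joining the three projections back on common keys reconstructs the original rows.
joined = cur.execute("""
    SELECT bv.buyer, bv.vendor, bi.item
    FROM buyer_vendor bv
    JOIN buyer_item  bi ON bi.buyer  = bv.buyer
    JOIN vendor_item vi ON vi.vendor = bv.vendor AND vi.item = bi.item
""").fetchall()
print(sorted(joined) == sorted(rows))
```

For this particular data the join dependency holds, so the three-way join is lossless; in general, 5NF requires that every such join dependency follows from the candidate keys.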
(b) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support both batch and online application programs.
[Figure: IMS architecture. Each application program (Application A, Application B) is written in a host language with DL/I calls and runs against its own PSB (PSB-A, PSB-B); each PSB is a set of PCBs, and the IMS control program maps the PCBs onto the DBDs that define the physical databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD). The mapping of the physical database to storage is also defined in the DBD. The collection of DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All DBD names in IMS are limited to a maximum length of eight characters.
Example:
1  DBD    NAME=EDUCPDBD
2  SEGM   NAME=COURSE,BYTES=256
3  FIELD  NAME=(COURSE#,SEQ),BYTES=3,START=1
4  FIELD  NAME=TITLE,BYTES=33,START=4
5  FIELD  NAME=DESCRIPN,BYTES=220,START=37
6  SEGM   NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD  NAME=(COURSE#,SEQ),BYTES=3,START=1
8  FIELD  NAME=TITLE,BYTES=33,START=4
9  SEGM   NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD  NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD  NAME=LOCATION,BYTES=12,START=7
12 FIELD  NAME=FORMAT,BYTES=2,START=19
13 SEGM   NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD  NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD  NAME=NAME,BYTES=18,START=7
16 SEGM   NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD  NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD  NAME=NAME,BYTES=18,START=7
19 FIELD  NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called the program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example:
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers using a host language, from which the IMS data manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End-users are supported via user-written online application programs; IMS does not provide an integrated query language.
OR
(c) Explain the following:
(i) Functional dependency
Functional Dependency: the value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: a possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
The main characteristics of the functional dependencies used in normalization are that they have a 1:1 relationship between the attribute(s) on the left-hand and right-hand sides of the dependency, that they hold for all time, and that they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
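As a minimal sketch (sample relation and attribute names are invented for illustration), a functional dependency X → Y holds in a relation exactly when no two tuples agree on X but differ on Y:

```python
# Check whether a functional dependency lhs -> rhs holds in a list of tuples (dicts).
# The dependency holds when no two rows agree on the lhs attributes
# while differing on the rhs attributes.
def fd_holds(rows, lhs, rhs):
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False
        seen[key] = val
    return True

# Illustrative staff relation (not from the original answer).
staff = [
    {"staffNo": "S1", "name": "Ann", "branch": "B1"},
    {"staffNo": "S2", "name": "Bob", "branch": "B1"},
    {"staffNo": "S3", "name": "Ann", "branch": "B2"},
]

print(fd_holds(staff, ["staffNo"], ["name"]))  # staffNo -> name holds: True
print(fd_holds(staff, ["name"], ["branch"]))   # name -> branch fails (two Anns): False
```

Note that a check like this only shows a dependency holds *for the sample data*; a real functional dependency is a statement about all possible states of the relation.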
(d) Explain 4NF with examples.
Ans: Normalization: the process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal-form condition that it meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties. Practice in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, each of which corresponds to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and only if it is in BCNF and its multivalued dependencies are functional dependencies. 4NF removes unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form it must be in BCNF, and one of these conditions must hold: either there is no multivalued dependency in the relation, or there are multivalued dependencies but the attributes are dependent between themselves. Fourth normal form differs from BCNF only in that it uses multivalued dependencies.
Q5
Either
(a) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database); an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is the way the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could retrieve a user's account information and efficiently provide extensive information such as transactions and account entries.
(c) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used determines how much time and space your backups take, and how great your risk of data loss is when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery is easier to achieve if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls the following:
the speed and size of your transaction log backups; and
the degree to which you are at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery model available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified point in time can be achieved after media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
CREATE INDEX operations are logged, so recovery from a transaction log backup that includes index creations is faster, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans
Q3
EITHER
(a) Explain tuple relational calculus
Ans
Relational Calculus
A relational calculus query specifies what is to be retrieved rather than how to retrieve it; there is no description of how to evaluate the query.
In first-order logic (or predicate calculus), a predicate is a truth-valued function with arguments. When we substitute values for the arguments, the function yields an expression, called a proposition, which can be either true or false.
If a predicate contains a variable (e.g. 'x is a member of staff'), there must be a range for x. When we substitute some values of this range for x, the proposition may be true; for other values it may be false.
When applied to databases, relational calculus has two forms: tuple and domain.
Tuple Relational Calculus
We are interested in finding tuples for which a predicate is true, based on the use of tuple variables. A tuple variable is a variable that 'ranges over' a named relation, i.e. a variable whose only permitted values are tuples of the relation.
We specify the range of a tuple variable S as the Staff relation as:
Staff(S)
To find the set of all tuples S such that P(S) is true:
{S | P(S)}
Tuple Relational Calculus - Example
To find the details of all staff earning more than $10,000:
{S | Staff(S) ∧ S.salary > 10000}
To find a particular attribute, such as salary, write:
{S.salary | Staff(S) ∧ S.salary > 10000}
Tuple Relational Calculus
We can use two quantifiers to tell how many instances the predicate applies to:
the existential quantifier ∃ ('there exists'), and
the universal quantifier ∀ ('for all').
Tuple variables qualified by ∀ or ∃ are called bound variables; otherwise they are called free variables.
The existential quantifier is used in formulae that must be true for at least one instance, such as:
Staff(S) ∧ (∃B)(Branch(B) ∧ (B.branchNo = S.branchNo) ∧ B.city = 'London')
This means 'there exists a Branch tuple with the same branchNo as the branchNo of the current Staff tuple S, and which is located in London'.
The universal quantifier is used in statements about every instance, such as:
(∀B)(B.city ≠ 'Paris')
This means 'for all Branch tuples, the address is not in Paris'.
We can also use ~(∃B)(B.city = 'Paris'), which means 'there are no branches with an address in Paris'.
Tuple Relational Calculus
Formulae should be unambiguous and make sense. A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
We can recursively build up formulae from atoms:
An atom is a formula.
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, their disjunction F1 ∨ F2, and the negation ~F1.
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also formulae.
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000:
{S.fName, S.lName | Staff(S) ∧ S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow:
{S | Staff(S) ∧ (∃P)(PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
Expressions can generate an infinite set. For example:
{S | ~Staff(S)}
To avoid this, we add the restriction that all values in the result must be values in the domain of the expression.
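As a sketch (schema and sample data invented for illustration), the tuple-calculus query {S | Staff(S) ∧ S.salary > 10000} corresponds directly to a declarative SQL SELECT:

```python
import sqlite3

# Tiny Staff relation, invented for illustration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Staff (staffNo TEXT, fName TEXT, position TEXT, salary INTEGER)")
cur.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?, ?)",
    [("S1", "Ann", "Manager", 30000),
     ("S2", "Bob", "Assistant", 9000),
     ("S3", "Cid", "Manager", 12000)],
)

# {S | Staff(S) AND S.salary > 10000} -- whole tuples
rich = cur.execute("SELECT * FROM Staff WHERE salary > 10000").fetchall()

# {S.salary | Staff(S) AND S.salary > 10000} -- a single attribute
salaries = [r[0] for r in cur.execute("SELECT salary FROM Staff WHERE salary > 10000")]
print(len(rich), sorted(salaries))
```

Like the calculus expression, the query says only *what* tuples qualify; the DBMS decides how to evaluate it.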
Data Manipulations in SQL
SELECT, UPDATE, DELETE and INSERT statements
Basic data retrieval
Condition specification
Arithmetic and aggregate operators
SQL joins: multiple-table queries
Set manipulation: ANY, IN, CONTAINS, ALL, NOT IN, NOT CONTAINS, EXISTS, UNION, MINUS, INTERSECT
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement; data must be entered later using INSERT.
CREATE TABLE S ( SNO    CHAR(5),
                 SNAME  CHAR(20),
                 STATUS DECIMAL(3),
                 CITY   CHAR(15),
                 PRIMARY KEY (SNO) )
A table name and unique column names must be specified. Columns which are defined as primary keys will never have two rows with the same key value. A primary key may consist of more than one column (values unique in combination); this is called a composite key.
(b) Explain data manipulation in SQL.
Ans:
A data manipulation language (DML) is a computer programming language used for adding (inserting), deleting and modifying (updating) data in a database. A DML is often a sublanguage of a broader database language such as SQL, with the DML comprising some of the operators in the language.[1] Read-only selecting of data is sometimes distinguished as being part of a separate data query language (DQL), but it is closely related and sometimes also considered a component of a DML; some operators may perform both selecting (reading) and writing.
A popular data manipulation language is that of Structured Query Language (SQL), which is used to retrieve and manipulate data in a relational database.[2] Other forms of DML are those used by IMS/DL/I, and by CODASYL databases such as IDMS, among others.
In SQL, the data manipulation language comprises the SQL-data change statements,[3] which modify stored data but not the schema or database objects. Manipulation of persistent database objects, e.g. tables or stored procedures, via the SQL schema statements,[3] rather than the data stored within them, is considered to be part of a separate data definition language (DDL). In SQL these two categories are similar in their detailed syntax, data types, expressions, etc., but distinct in their overall function.[3]
The SQL-data change statements are a subset of the SQL-data statements; this set also contains the SELECT query statement,[3] which, strictly speaking, is part of the DQL, not the DML. In common practice, though, this distinction is not made, and SELECT is widely considered to be part of DML,[4] so the DML consists of all SQL-data statements, not only the SQL-data change statements. The SELECT INTO form combines both selection and manipulation, and thus is strictly considered to be DML, because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a statement, which is almost always a verb. In the case of SQL, these verbs are:
SELECT ... FROM ... WHERE ... (strictly speaking, DQL)
SELECT ... INTO ...
INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...
For example, the command to insert a row into the table employees:
INSERT INTO employees (first_name, last_name, fname) VALUES ('John', 'Capita', 'xcapit00');
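A minimal sketch of these verbs in action, using SQLite (table and data are illustrative, not from the original answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT INTO ... VALUES ...
cur.execute("INSERT INTO employees (first_name, last_name, fname) "
            "VALUES ('John', 'Capita', 'xcapit00')")

# UPDATE ... SET ... WHERE ...
cur.execute("UPDATE employees SET last_name = 'Capital' WHERE fname = 'xcapit00'")

# SELECT ... FROM ... WHERE ... (strictly speaking, DQL)
row = cur.execute("SELECT first_name, last_name FROM employees "
                  "WHERE fname = 'xcapit00'").fetchone()
print(row)  # ('John', 'Capital')

# DELETE FROM ... WHERE ...
cur.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
count = cur.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(count)  # 0
```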
OR
(c) Explain the following integrity rules:
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs apply these rules automatically, but it is safer to make sure that the rules are already applied in the design. There are two types of integrity mentioned in integrity rules: entity and referential. Two additional rules that aren't necessarily included in integrity rules, but are pertinent to database designs, are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that is unique; this ensures that each row is uniquely identified by the primary key. One requirement for entity integrity is that a primary key cannot have a null value. The purpose of this integrity is for each row to have a unique identity, so that foreign key values can properly reference primary key values.
Theta Join
In a theta join we apply a condition on the input relation(s), and only the selected rows are then used in the cross product to be merged and included in the output. In a normal cross product, all the rows of one relation are mapped/merged with all the rows of the second relation; here, only selected rows of a relation are crossed with the second relation. It is denoted R ⋈θ S.
If R and S are two relations, then θ is the condition which is applied for the select operation on one relation, after which only the selected rows are crossed with all the rows of the second relation. For example, given two relations FACULTY and COURSE, we first apply a select operation on the FACULTY relation to select certain specific rows; these rows then form a cross product with the COURSE relation. This is the difference between the cross product and the theta join.
Seeing both relations, their attributes, and the cross product after the select operation makes the difference between cross product and theta join clear.
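A sketch in SQLite (FACULTY and COURSE schemas invented for illustration): the theta join is simply the cross product restricted by a condition:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE faculty (fac_id INTEGER, dept TEXT)")
cur.execute("CREATE TABLE course (course_id TEXT, fac_id INTEGER)")
cur.executemany("INSERT INTO faculty VALUES (?, ?)", [(1, "CS"), (2, "EE")])
cur.executemany("INSERT INTO course VALUES (?, ?)", [("C1", 1), ("C2", 1), ("C3", 2)])

# Cross product: every faculty row paired with every course row (2 x 3 = 6 rows).
cross = cur.execute("SELECT * FROM faculty, course").fetchall()

# Theta join: the cross product restricted by the condition f.fac_id = c.fac_id.
theta = cur.execute(
    "SELECT f.fac_id, f.dept, c.course_id "
    "FROM faculty f, course c WHERE f.fac_id = c.fac_id"
).fetchall()
print(len(cross), len(theta))  # 6 3
```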
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having the foreign key (in the associated table) reference a primary key value (in the primary, or parent, table). Because of this, we need to ensure that data on both sides of the relationship remain intact.
So, referential integrity requires that whenever a foreign key value is used, it must reference a valid, existing primary key in the parent table.
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no foreign key in any related table with the value of 15. We should only be able to delete a primary key if there are no associated records. Otherwise, we would end up with an orphaned record.
Here, the related table contains a foreign key value that doesn't exist in the primary key field of the primary table (i.e. the "CompanyId" field). This has resulted in an "orphaned record".
So, referential integrity will prevent users from:
Adding records to a related table if there is no associated record in the primary table.
Changing values in a primary table that result in orphaned records in a related table.
Deleting records from a primary table if there are matching related records.
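SQLite can demonstrate this enforcement; the schema below is invented for illustration (note that SQLite only checks foreign keys when the pragma is switched on):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when this is set
conn.execute("CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, "
    "company_id INTEGER REFERENCES company(company_id))"
)
conn.execute("INSERT INTO company VALUES (1, 'Acme')")
conn.execute("INSERT INTO product VALUES (10, 1)")  # valid: parent row exists

# Adding a related record with no associated primary record is rejected.
try:
    conn.execute("INSERT INTO product VALUES (11, 15)")  # no company 15
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

# Deleting a primary record that has matching related records is rejected too.
try:
    conn.execute("DELETE FROM company WHERE company_id = 1")
    delete_allowed = True
except sqlite3.IntegrityError:
    delete_allowed = False

print(orphan_allowed, delete_allowed)  # False False
```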
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually with no indication of an error. This could result in records being "lost" in the database, because they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated company).
Or, worse yet, it could result in customers not receiving products they paid for.
Worse still, it could affect life-and-death situations, such as a hospital patient not receiving the correct treatment, or a disaster relief team not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity, which is concerned with the accuracy and consistency of all data (relationship or otherwise). Maintaining data integrity is a crucial part of working with databases.
(d) Explain the following in detail, with examples:
(i) Domain
Ans: Definition: the domain of a database attribute is the set of all allowable values that attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are the only permitted entries in that column.
In data management and database analysis, a data domain refers to all the unique values which a data element may contain. The rule for determining the domain boundary may be as simple as a data type with an enumerated list of values.[1]
For example, a database table that has information about people, with one record per person, might have a gender column. This gender column might be declared as a string data type, and allowed to have one of two known code values: M for male, F for female, and NULL for records where gender is unknown or not applicable (or, arguably, U for unknown as a sentinel value). The data domain for the gender column is {M, F}.
In a normalized data model, the reference domain is typically specified in a reference table. Following the previous example, a Gender reference table would have exactly two records, one per allowed value, excluding NULL. Reference tables are formally related to other tables in a database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check constraint or, in more complex cases, a database trigger. For example, a column requiring positive numeric values may have a check constraint declaring that the values must be greater than zero.
This definition combines the concept of domain as an area over which control is exercised with the mathematical idea of a set of values of an independent variable for which a function is defined.
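Domain rules of this kind can be sketched as check constraints in SQLite (table and column names invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK clauses restrict each column to its domain.
conn.execute(
    "CREATE TABLE person ("
    "  name   TEXT,"
    "  gender TEXT CHECK (gender IN ('M', 'F')),"
    "  age    INTEGER CHECK (age > 0)"
    ")"
)
conn.execute("INSERT INTO person VALUES ('Ann', 'F', 34)")  # within both domains

rejected = []
for row in [("Bob", "X", 40), ("Cid", "M", -1)]:  # 'X' and -1 fall outside the domains
    try:
        conn.execute("INSERT INTO person VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(row[0])

print(rejected)  # ['Bob', 'Cid']
```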
(ii) Degree and cardinality
The degree of a relationship (also known as its cardinality) is the number of occurrences in one entity which are associated (or linked) with the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
Note that the last is written M:N, not M:M.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-to-one relationship rarely exists in practice, but it can; if one does, you may consider combining the two entities into one.
For example, an employee is allocated a company car which can only be driven by that employee. Therefore, there is a one-to-one relationship between employee and company car.
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example, taking the employee and department entities shown on the previous page, an employee works in one department, but a department has many employees. Therefore, there is a one-to-many relationship between department and employee.
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity. The normalisation process discussed earlier would prevent any such relationships, but the definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely exist; normally they occur because an entity has been missed. For example, an employee may work on several projects at the same time, and a project has a team of many employees. Therefore, there is a many-to-many relationship between employee and project.
Q4
EITHER
(a) Explain DBTG data manipulation.
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on Data Systems Languages (CODASYL), the group responsible for standardization of the programming language COBOL. The DBTG final report appeared in April 1971; it introduced a new, distinct and self-contained language. The DBTG proposal is intended to meet the requirements of many distinct programming languages, not just COBOL; the user in a DBTG system is considered to be an ordinary application programmer, and the language therefore is not biased toward any single specific programming language.
The DBTG model is based on the network model. In addition to proposing a formal notation for networks (the Data Definition Language, or DDL), the DBTG proposed a Subschema Data Definition Language (Subschema DDL) for defining views of a conceptual scheme that was itself defined using the Data Definition Language. It also proposed a Data Manipulation Language (DML) suitable for writing application programs that manipulate the conceptual scheme or a view.
Architecture of the DBTG Model
The architecture of a DBTG system is illustrated in the figure. It can be divided into three different levels, as in the architecture of a database system:
- Storage schema (corresponds to the internal view of the database)
- Schema (corresponds to the conceptual view of the database)
- Subschema (corresponds to the external view of the database)
Storage Schema
The storage structure (internal view) of the database is described by the storage schema, written in a Data Storage Description Language (DSDL).
Schema
In DBTG the conceptual view is defined by the schema. The schema consists essentially of definitions of the various types of record in the database, the data items they contain, and the sets into which they are grouped. (Here, logical record types are referred to as record types; the fields in a logical record format are called data items.)
Subschema
The external view (not a DBTG term) is defined by a subschema. A subschema consists essentially of a specification of which schema record types the user is interested in, which schema data items he or she wishes to see in those records, and which schema relationships (sets) linking those records he or she wishes to consider. By default, all other types of record, data item and set are excluded.
In the DBTG model the users are application programmers writing in an ordinary programming language, such as COBOL, that has been extended to include the DBTG data manipulation language. Each application program invokes the corresponding subschema; using the COBOL Data Base Facility, for example, the programmer simply specifies the name of the required subschema in the Data Division of the program. This invocation provides the definition of the user work area (UWA) for that program. The UWA contains a distinct location for each type of record (and hence for each data item) defined in the subschema. The program may refer to these data item and record locations by the names defined in the subschema.
Q5
EITHER
(a) Define normalization. Explain first and second normal form.
Ans: Normalization: the process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties. Practice in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table, transform the data from the information source (e.g. a form) into table format, with columns and rows.
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value. If a table of data meets the definition of a relation, it is in first normal form:
Every relation has a unique name.
Every attribute value is atomic (single-valued).
Every row is unique.
Attributes in tables have unique names.
The order of the columns is irrelevant.
The order of the rows is irrelevant.
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove each repeating group, either by entering appropriate data into the empty columns of rows containing repeating data ('flattening' the table), or by placing the repeating data, along with a copy of the original key attribute(s), into a separate relation.
Second Normal Form (2NF)
Based on concept of full functional dependency
A and B are attributes of a relation
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A
2NF - A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on the primary key
Second Normal Form (2NF)
1NF and no partial functional dependencies
Partial functional dependency when one or more non-key attributes are functionally
dependent on part of the primary key
Every non-key attribute must be defined by the entire key not just by part of the key
If a relation has a single attribute as its key then it is automatically in 2NF
1NF to 2NF
Identify primary key for the 1NF relation
Identify functional dependencies in the relation
If partial dependencies exist on the primary key remove them by placing them in a new
relation along with copy of their determinant
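As a sketch of these steps, consider a hypothetical 1NF relation with composite key (student_id, course); student_name depends on student_id alone, a partial dependency that is removed by placing it in a new relation together with a copy of its determinant:

```python
# Hypothetical 1NF relation: key is (student_id, course), but
# student_name is determined by student_id alone (a partial dependency).
grades_1nf = [
    (1, "DBMS", "Asha", "A"),
    (1, "OS",   "Asha", "B"),
    (2, "DBMS", "Ravi", "A"),
]

# Remove the partial dependency: student_name moves to a new relation
# together with a copy of its determinant, student_id.
student = sorted({(sid, name) for sid, _, name, _ in grades_1nf})
grade = [(sid, course, g) for sid, course, _, g in grades_1nf]
```

Both resulting relations are in 2NF: every non-key attribute now depends on the whole of its relation's key.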
Third Normal Form (3NF)
2NF and no transitive dependencies
Transitive dependency a functional dependency between two or more non-key attributes
Based on concept of transitive dependency
A, B and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C)
3NF - A relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key
OR
(c)Explain multivalued dependency with suitable example
As normalization proceeds relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies
Ans
1. NF²: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every non-key attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every non-key attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is in 4NF
if and only if it is in BCNF and every non-trivial multivalued dependency in it is in fact a
functional dependency. 4NF thus removes unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form, either of these conditions must hold:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
considers multivalued dependencies.
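A suitable example, sketched in Python: in the classic relation (course, teacher, book), the teachers and recommended books of a course are independent of each other, giving course →→ teacher and course →→ book. The relation contents below are hypothetical; the check applies the MVD definition directly:

```python
from itertools import product

# Hypothetical relation (course, teacher, book): the teachers and the
# recommended books for a course are independent of each other, so the
# relation must hold their full cross product.
r = {("DBMS", t, b)
     for t, b in product(["Smith", "Jones"], ["Ullman", "Date"])}

def holds_mvd(rel):
    """course ->-> teacher: for every pair of tuples agreeing on course,
    the tuple mixing t1's teacher with t2's book must also be present."""
    return all((t1[0], t1[1], t2[2]) in rel
               for t1, t2 in product(rel, rel) if t1[0] == t2[0])
```

Removing any one tuple from the cross product breaks the dependency, which is exactly the redundancy 4NF decomposition eliminates.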
(d) What are inference axioms Explain its significance in Relational
Database Design
Ans Inference Axioms (A-axioms or Armstrongrsquos Axioms)
An inference axiom is a rule that states if a relation satisfies certain FDs then it must satisfy
certain other FDs
F1 Reflexivity: X → X
F2 Augmentation: if Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: if X → Y and X → Z, then X → YZ
F4 Projectivity: if X → YZ, then X → Y
F5 Transitivity: if X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: if X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – given
2. Street Zip → Street City – augmentation of (1) by Street
3. City Street → Zip – given
4. City Street → City Street Zip – augmentation of (3) by City Street
5. Street Zip → City Street Zip – transitivity of (2) and (4)
[From Maier]
1. Let R = (A, B, C, D, E, G, H, I, J), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F:
1. AB → E – given
2. AB → AB – reflexivity
3. AB → B – projectivity from (2)
4. AB → BE – additivity from (1) and (3)
5. BE → I – given
6. AB → I – transitivity from (4) and (5)
7. E → G – given
8. AB → G – transitivity from (1) and (7)
9. AB → GI – additivity from (6) and (8)
10. GI → H – given
11. AB → H – transitivity from (9) and (10)
12. AB → GH – additivity from (8) and (11)
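A practical way to verify such a derivation is to compute the attribute closure. The sketch below is a standard closure algorithm (not from the source) applied to Maier's example; since G and H appear in the closure of AB, AB → GH holds:

```python
def closure(attrs, fds):
    """Compute the closure X+ of an attribute set under a list of
    functional dependencies given as (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already in the closure, pull in the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Maier's example: F = {AB -> E, AG -> J, BE -> I, E -> G, GI -> H}
F = [({"A", "B"}, {"E"}), ({"A", "G"}, {"J"}), ({"B", "E"}, {"I"}),
     ({"E"}, {"G"}), ({"G", "I"}, {"H"})]
```

This is the same reasoning as the step-by-step axiom proof, mechanized: each loop pass applies transitivity/additivity until nothing new can be added.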
Significance in relational database design: A relational database is a database structure, commonly used in
GIS, in which data is stored in two-dimensional tables where multiple relationships between data
elements can be defined and established in an ad-hoc manner. A Relational Database Management
System (RDBMS) is a database system made up of files with data elements in a two-dimensional array (rows
and columns). This database management system has the capability to recombine data elements
to form different relations, resulting in great flexibility of data usage. The inference axioms are significant
here because they let the designer compute attribute closures, derive all functional dependencies implied
by a given set, identify candidate keys, and test decompositions for normal forms such as 3NF and BCNF.
A database that is perceived by the user as a collection of two-dimensional tables
• Are manipulated a set at a time rather than a record at a time
• SQL is used to manipulate relational databases. Proposed by Dr Codd in 1970
• The basis for the relational database management system (RDBMS)
• The relational model contains the following components:
• Collection of objects or relations
• Set of operations to act on the relations
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions each require access to data that
is locked by the other. It can be dealt with in two ways: measures can be set which
prevent deadlocks from happening, or ways can be set in which to break the deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order, which prevents circular waits. Essentially,
once a deadlock does occur the DBMS must have a method for detecting the deadlock;
to resolve it, the DBMS must select a transaction to cancel and revert that entire
transaction until the resources required become available, allowing one transaction to
complete while the other has to be reprocessed at a later time.

Explain the meaning of the expression ACID transaction.
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens it
should be atomic: it should either be complete or not happen at all; there should not
be anything like a semi-complete transaction. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, then the transactions
should be scheduled in such a fashion that they remain in isolation from one another. Durability
means that once a transaction commits, its effects will persist even if there are system failures.

What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database is to operate while transactions are in the process of being
changed. Their purpose is to ensure consistency throughout the database. For example, if I
am changing a row which affects the calculations or outputs of several other rows, then
all rows that are affected, or possibly affected, by a change in the row I am working on will
be locked from changes until I am complete with my change. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let us
say I am processing a change to the tax rate in my state: my store clerk should not be able
to read the total cost of a blue shirt, because the total cost row is affected by any change in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but has not been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
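The resource-access-order prevention strategy described above can be sketched as follows (the lock ids and helper function are hypothetical; the point is that every transaction acquires locks in one fixed global order, so a circular wait can never form):

```python
import threading

# Fixed global order over lock ids: all transactions acquire in sorted order.
locks = {1: threading.Lock(), 2: threading.Lock()}

def run_locked(ids, action):
    """Acquire all needed locks in a fixed (sorted) order up front,
    run the work, then release in reverse order."""
    held = []
    try:
        for i in sorted(ids):
            locks[i].acquire()
            held.append(i)
        return action()
    finally:
        for i in reversed(held):
            locks[i].release()
```

Two transactions that *request* locks in opposite orders, e.g. `run_locked([1, 2], ...)` and `run_locked([2, 1], ...)`, still *acquire* them in the same sorted order, so neither can end up waiting for the other in a cycle.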
(b) Explain concurrency control and database recovery in detail
Ans In a multiprogramming environment where multiple transactions can be executed
simultaneously it is highly important to control the concurrency of transactions We have
concurrency control protocols to ensure atomicity isolation and serializability of concurrent
transactions Concurrency control protocols can be broadly divided into two categories:
Lock based protocols
Time stamp based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it Locks are of two
kinds:
Binary Locks – A lock on a data item can be in two states; it is either locked or
unlocked
Shared/exclusive – This type of locking mechanism differentiates the locks based on
their uses If a lock is acquired on a data item to perform a write operation it is an
exclusive lock Allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state Read locks are shared because no data
value is being changed
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed Transactions may unlock the data item after completing the
'write' operation
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
Strict Two-Phase Locking
The first phase of Strict-2PL is same as 2PL After acquiring all the locks in the first phase the
transaction continues to execute normally But in contrast to 2PL Strict-2PL does not release a
lock after using it Strict-2PL holds all the locks until the commit point and releases all the locks
at a time
Strict-2PL does not have cascading abort as 2PL does
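The shared/exclusive compatibility rules these lock protocols rely on can be sketched minimally as below (the class and method names are invented for illustration; a real lock manager would also queue and block waiting transactions rather than just refuse):

```python
class LockTable:
    """Minimal shared/exclusive lock table: many readers may share an
    item; a writer needs exclusive access to it."""
    def __init__(self):
        self.readers = {}   # item -> set of txn ids holding S (read) locks
        self.writer = {}    # item -> txn id holding the X (write) lock

    def lock_s(self, txn, item):
        # A shared lock is refused only if another txn holds the X lock.
        if self.writer.get(item) not in (None, txn):
            return False
        self.readers.setdefault(item, set()).add(txn)
        return True

    def lock_x(self, txn, item):
        # An exclusive lock is refused if any other txn holds any lock.
        others = self.readers.get(item, set()) - {txn}
        if others or self.writer.get(item) not in (None, txn):
            return False
        self.writer[item] = txn
        return True
```

Read locks are compatible with each other but not with a write lock, which is exactly the inconsistency-avoidance rule stated above.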
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based protocol This protocol
uses either system time or logical counter as a timestamp
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution whereas timestamp-based protocols start working as soon as a transaction is
created
Every transaction has a timestamp associated with it and the ordering is determined by the age
of the transaction. A transaction created at 00:02 clock time would be older than all other
transactions that come after it. For example, any transaction y entering the system at 00:04 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read- and write-timestamp. This lets the system
know when the last 'read' and 'write' operations were performed on the data item.
OR
(b) Explain database security mechanisms
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d)Explain knowledge based database system in detail
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data Data was usually represented in a tabular format with strings or numbers in each
field
Multiple users A conventional database needed to support more than one user or system
logged into the same data at the same time
Transactions An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users These are the so-
called ACID properties Atomicity Consistency Isolation and Durability
Large long-lived data A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data Such a database usually needed to persist past
the specific uses of any individual program it needed to store data for years and decades
rather than for the life of a program
The first knowledge-based systems had data needs that were the opposite of these database
requirements An expert system requires structured data Not just tables with numbers and
strings but pointers to other objects that in turn have additional pointers The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes subclasses and instances
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database The knowledge-base needed to know facts about the world For example to represent
the statement that All humans are mortal A database typically could not represent this general
knowledge but instead would need to store information about thousands of tables that
represented information about specific humans Representing that all humans are mortal and
being able to reason about any given human that they are mortal is the work of a knowledge-
base Representing that George Mary Sam Jenna Mike and hundreds of thousands of other
customers are all humans with specific ages sex address etc is the work for a database[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well On the other hand the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined As the technology scaled up it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning and knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 – 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans: Concurrency control is the procedure in DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data; there is no way they can interfere with one another. However, any
practical database has a mix of READ and WRITE operations, and
hence the concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur with a multi-
user system. It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore concurrency control is a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a
consistent state, that is, money is neither lost nor created if either of those two operations fails.
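The bank-transfer example can be sketched as an all-or-nothing operation that rolls back to a snapshot on any failure (a simplification: a real DBMS uses write-ahead logs rather than in-memory snapshots; the account data is hypothetical):

```python
accounts = {"A": 100, "B": 50}

def transfer(src, dst, amount):
    """All-or-nothing transfer: on any failure the snapshot is restored,
    so no partially updated state is ever left visible."""
    snapshot = dict(accounts)
    try:
        if accounts[src] < amount:
            raise ValueError("insufficient funds")
        accounts[src] -= amount
        accounts[dst] += amount
    except Exception:
        accounts.clear()
        accounts.update(snapshot)   # roll back: atomicity preserved
        raise
```

After a successful call both operations have happened; after a failed call neither has, so the total amount of money is conserved in either case.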
(B) Give the three level architecture proposal for DBMS
Ans: Objectives of the three level architecture proposal for DBMS:
All users should be able to access the same data
A user's view is immune to changes made in other views
Users should not need to know physical database storage details
The DBA should be able to change database storage structures without affecting the users' views
The internal structure of the database should be unaffected by changes to physical aspects of storage
The DBA should be able to change the conceptual structure of the database without affecting all users
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
Above three points are explain in detail given bellow-
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language such as COBOL or C or a fourth-generation
language specific to the DBMS such as visual FoxPro or MS Access
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database object while the data
manipulation language performs operations on these objects The data control language is used to
control the user's access to database objects
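The DDL/DML split can be illustrated with Python's built-in sqlite3 module (SQLite is used only as a convenient, self-contained example; being an embedded engine with no user accounts, it has no DCL statements such as GRANT/REVOKE, so DCL appears only as a comment):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# DDL: define and declare the database object.
cur.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object.
cur.execute("INSERT INTO student VALUES (1, 'Asha')")
rows = cur.execute("SELECT name FROM student").fetchall()

# DCL (e.g. GRANT SELECT ON student TO clerk) would control access
# rights in a client/server DBMS; SQLite has no users to grant to.
```
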
Conceptual Level - This level comes between the external and the internal levels The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data 'as it really is'. The user's view of the data is constrained by the language
that they are using. At the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus the objectives of the three level architecture proposal for DBMS are suitably explained
above.
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2 DML Compiler and Query optimizer - The DML commands such as insert update delete
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access The object code is then optimized in the best way to execute a query by
the query optimizer and then send to the data manager
3 Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System
The main functions of the Data Manager are:
Convert operations in users Queries coming from the application programs or combination of
DML Compiler and Query optimizer which is known as Query Processor from users logical view
to physical file system
Controls DBMS information access that is stored on disk
It also controls handling buffers in main memory
It also enforces constraints to maintain consistency and integrity of the data
It also synchronizes the simultaneous operations performed by the concurrent users
It also controls the backup and recovery operations
4 Data Dictionary - Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data - names of the tables, names of attributes of each table, length of attributes, and
number of rows in each table
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data
definitions are changed
3. Constraints on data, i.e. the range of values permitted
4. Detailed information on physical database design, such as storage structure, access
paths, and file and record sizes
5. Access authorization - the description of database users, their responsibilities and
their access rights
6. Usage statistics, such as frequency of queries and transactions
The data dictionary is used to actually control the data integrity, database operation
and accuracy, and may be used as an important part of the DBMS.
Importance of Data Dictionary - the data dictionary is necessary in databases due to the
following reasons:
It improves the control of the DBA over the information system and the users'
understanding of the use of the system
It helps in documenting the database design process by storing documentation of
the result of every design phase and design decisions
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML complier converts the high level Queries into low level file access
commands known as compiled DML
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve Users: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of response is always indicated. Thus a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online Users: There are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of physical storage from use of the
data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored; such details are
transparent to the user. Changes can be made to data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic operations), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one
device to another, e.g. from optical to magnetic storage, or from tape to disk.
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system
Every user group maintains its own files for handling its data files This may lead to
• Duplication of same data in different files
• Wastage of storage space since duplicated data is stored
• Errors may be generated due to duplication of the same data in different files
• Time in entering data again and again is wasted
• Computer Resources are needlessly used
• It is very difficult to combine information
2 Elimination of Inconsistency - In the file processing system information is duplicated
throughout the system So changes made in one file may be necessary be carried over to
another file This may lead to inconsistent data So we need to remove this duplication of
data in multiple file to eliminate inconsistency
3 Better service to the users - A DBMS is often used to provide better services to the users In
conventional system availability of information is often poor since it normally difficult to
obtain information that the existing systems were not designed for Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to anticipated information requests
Centralizing the data in the database also means that user can obtain new and combined
information easily that would have been impossible to obtain otherwise Also use of DBMS
should allow users that don't know programming to interact with the data more easily, unlike
file processing system where the programmer may need to write new programs to meet every
new demand
4 Flexibility of the System is improved - Since changes are often necessary to the contents of
the data stored in any system these changes are made more easily in a centralized database
than in a conventional system Applications programs need not to be changed on changing the
data in the database
5 Integrity can be improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to incorrect data being entered in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the
purpose of data interchange or migration between systems.
7 Security can be improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization access different
components of the operational data, and in such an environment enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database. Different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8 Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work, and therefore its own
needs, as the most important. Once a database has been set up with centralized control, it
becomes necessary to identify the organization's requirements and to balance the needs of
the competing units. It may even become necessary to ignore some requests for information
if they conflict with higher-priority needs of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed with DBMSs than using procedural languages.
10 Data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems it is more likely that files will be designed as the needs of particular
applications demand. The overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes such as
recovery and backup from failures, including disk crashes, power failures and software errors,
which may help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods involved are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans The ER model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process in which all business managers (or their designates) should be
involved, and it should be validated with a "bottom-up" approach. It has three primary
components: entity, relationship and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified An entity is an abstraction from the complexities of some
domain When we speak of an entity we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object, such as a house or a car; an event, such as a house sale or a car service; or a concept, such as a customer transaction or order.
An entity-type is a category An entity strictly speaking is an instance of a given entity-type There are
usually many instances of an entity-type Because the term entity-type is somewhat cumbersome most
people tend to use the term entity as a synonym for this term
Attributes An attribute is a characteristic of an entity. A Student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer; a supervises relationship between an employee and a department; a performs relationship
between an artist and a song; a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds, connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One to many: 1 ------- M    Many to one: M ------- 1
Many to many: M ------- N
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), where
street is itself composite: (street_name, street_number, apartment_number).
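One plausible mapping of this entity to relations, sketched in Python with SQLite (table and column names here are illustrative, not prescribed by the text): composite attributes such as name and address flatten into individual columns, while a multivalued attribute such as phone_number moves to its own table.

```python
import sqlite3

# Hypothetical mapping of the Customer entity: composite attributes are
# flattened into columns; the multivalued phone_number gets its own table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    first_name    TEXT,
    middle_name   TEXT,
    last_name     TEXT,
    date_of_birth TEXT,
    city          TEXT,
    state         TEXT,
    zip_code      TEXT,
    street_name   TEXT,
    street_number TEXT
);
CREATE TABLE customer_phone (        -- one row per phone number per customer
    customer_id  INTEGER REFERENCES customer(customer_id),
    phone_number TEXT,
    PRIMARY KEY (customer_id, phone_number)
);
""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name) "
             "VALUES (1, 'Asha', 'Rao')")
conn.execute("INSERT INTO customer_phone VALUES (1, '555-0101')")
conn.execute("INSERT INTO customer_phone VALUES (1, '555-0102')")
phones = [r[0] for r in conn.execute(
    "SELECT phone_number FROM customer_phone "
    "WHERE customer_id = 1 ORDER BY phone_number")]
print(phones)  # one customer, several phone numbers
```

A derived attribute (e.g. age) would not be stored at all but computed from date_of_birth at query time.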
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example
Ans In sequential files, index sequential files and direct files we have considered the retrieval and
update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file on the attribute "stud_name", we can get the set of
records which satisfy the given value.
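The student-file example above can be sketched with SQLite (the table and sample data are illustrative assumptions): an index on the secondary key "stud_name" supports retrieval, and one key value matches several records.

```python
import sqlite3

# Sketch: retrieval on a secondary key can return many records,
# unlike the primary key, which identifies exactly one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, "
             "stud_name TEXT, branch TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, 'Amit', 'MCA'), (2, 'Priya', 'MCA'), (3, 'Amit', 'MBA')])
conn.execute("CREATE INDEX idx_stud_name ON student(stud_name)")  # secondary index
rows = conn.execute("SELECT roll_no FROM student "
                    "WHERE stud_name = 'Amit' ORDER BY roll_no").fetchall()
print(len(rows))  # multiple records satisfy one secondary-key value
```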
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3 - EITHER
(A) Let R(A, B, C) and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows
1 If a table can be decomposed into three or more smaller tables it must be capable of being joined
again on common keys to form the original table
A table is in fifth normal form (5NF) or Projection-Join Normal Form (PJNF) if it is in 4NF and it cannot
have a lossless decomposition into any number of smaller tables
Another way of expressing this is that every join dependency is a consequence of the candidate keys.
It can also be expressed thus: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further then it is in 5NF
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one, you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell Jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to
determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
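The decomposition just described can be sketched with SQLite: the Buying table from the sample data above is projected onto its three attribute pairs, and the natural join of all three projections reconstructs the original rows (the join dependency). Table names follow the text; the code itself is an illustration, not part of the original answer.

```python
import sqlite3

# 5NF sketch: Buying(buyer, vendor, item) is split into three pairwise
# projections; joining all three back together reconstructs the original.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE buying (buyer TEXT, vendor TEXT, item TEXT)")
data = [('Sally', 'Liz Claiborne', 'Blouses'),
        ('Mary',  'Liz Claiborne', 'Blouses'),
        ('Sally', 'Jordach',       'Jeans'),
        ('Mary',  'Jordach',       'Jeans'),
        ('Sally', 'Jordach',       'Sneakers')]
conn.executemany("INSERT INTO buying VALUES (?, ?, ?)", data)
conn.executescript("""
CREATE TABLE buyer_vendor AS SELECT DISTINCT buyer, vendor FROM buying;
CREATE TABLE buyer_item   AS SELECT DISTINCT buyer, item   FROM buying;
CREATE TABLE vendor_item  AS SELECT DISTINCT vendor, item  FROM buying;
""")
# Natural join of the three projections on their common columns.
rejoined = conn.execute("""
    SELECT DISTINCT bv.buyer, bv.vendor, vi.item
    FROM buyer_vendor bv
    JOIN vendor_item vi ON bv.vendor = vi.vendor
    JOIN buyer_item  bi ON bi.buyer = bv.buyer AND bi.item = vi.item
    ORDER BY bv.buyer, bv.vendor, vi.item
""").fetchall()
print(len(rejoined))  # same five rows as the original table
```

Note that after the split, recording "Claiborne sells Jeans" takes one row in Vendor-Item instead of one row per buyer.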
(B) Explain the architecture of an IMS System
Ans Information Management system (IMS) is an IBM program product that is designed to support
both batch and online application programs
[Figure: IMS system architecture. Application A and Application B, each written in a host
language + DL/I, access the IMS control program through their own program specification
blocks (PSB-A, PSB-B), each consisting of PCBs; the control program in turn uses the DBDs
to reach the stored databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also given in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated
conceptual/internal mapping definition.
DBD (Database Description) Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD    NAME=EDUCPDBD
2  SEGM   NAME=COURSE,BYTES=256
3  FIELD  NAME=(COURSE,SEQ),BYTES=3,START=1
4  FIELD  NAME=TITLE,BYTES=33,START=4
5  FIELD  NAME=DESCRIPN,BYTES=220,START=37
6  SEGM   NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD  NAME=(COURSE,SEQ),BYTES=3,START=1
8  FIELD  NAME=TITLE,BYTES=33,START=4
9  SEGM   NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD  NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD  NAME=LOCATION,BYTES=12,START=7
12 FIELD  NAME=FORMAT,BYTES=2,START=19
13 SEGM   NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD  NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD  NAME=NAME,BYTES=18,START=7
16 SEGM   NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD  NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD  NAME=NAME,BYTES=18,START=7
19 FIELD  NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called the program specification block
(PSB).
PCB (Program Communication Block) Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block) The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End-users are
supported via user-written on-line application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency The value of one attribute (the determinant)
determines the value of another attribute
Candidate Key A possible key
Each non-key field is functionally dependent on every candidate key
No attribute in the key can be deleted without destroying the property of
unique identification
Main characteristics of functional dependencies used in normalization: they have a 1:1
relationship between the attribute(s) on the left- and right-hand sides of the dependency,
they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is
important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is smaller than
the complete set of functional dependencies (Y) for that relation and has the property that
every functional dependency in Y is implied by the functional dependencies in X.
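Whether a dependency X → Y holds in a given relation instance can be checked mechanically: no two tuples may agree on X yet disagree on Y. The helper below is an illustrative sketch (the function name and sample data are assumptions, not from the text):

```python
# Sketch: X -> Y holds in a relation instance when no two tuples
# agree on the X attributes but disagree on the Y attributes.
def holds(rows, lhs, rhs):
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)  # determinant value
        y = tuple(row[a] for a in rhs)  # dependent value
        if x in seen and seen[x] != y:
            return False                # same X, different Y: FD violated
        seen[x] = y
    return True

staff = [
    {'staffNo': 'S1', 'branchNo': 'B1', 'city': 'London'},
    {'staffNo': 'S2', 'branchNo': 'B1', 'city': 'London'},
    {'staffNo': 'S3', 'branchNo': 'B2', 'city': 'Glasgow'},
]
print(holds(staff, ['staffNo'], ['branchNo']))  # staffNo determines branchNo
print(holds(staff, ['city'], ['staffNo']))      # same city, different staff
```

Note this only tests one instance; a real FD is a constraint that must hold for all time, as stated above.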
(D) Explain 4 NF with examples
Ans Normalization: the process of decomposing unsatisfactory ("bad") relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest
normal-form condition that it meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and
meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
The database designers need not normalize to the highest possible normal form
Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes
Often executed as a series of steps Each step corresponds to a specific normal form which has
known properties
As normalization proceeds relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multi-valued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all of its multi-valued dependencies are functional dependencies. 4NF
removes unwanted data structures: multi-valued dependencies.
For a relation to be in fourth normal form, either of these conditions must hold:
there is no multivalued dependency in the relation; or
there are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it deals with
multivalued dependencies.
Q5
Either
(A) What are object oriented database systems What are its features
Ans Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
(C) How is database recovery done? Discuss its different types
Ans SQL Server database recovery models give you backup-and-restore flexibility. The model used will
determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000.
It has a built-in feature known as the database recovery model that controls the following:
The speed and size of your transaction log backups.
The degree to which you might be at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery model available:
Full Recovery
Bulk Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery The SQL Server fully logs all operations so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log When data files are lost because of media failure the transaction log can be backed up
Database restoration up to any specified time can be achieved after media failure for a database
file has occurred If your log file is available after the failure you can restore up to the last
transaction committed.
The Log Marks feature allows you to place reference points in the transaction log that allow you to
recover to a log mark.
It logs CREATE INDEX operations. Recovery from a transaction log backup that includes index
creations is faster because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under
this model SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system
Ans
Staff(S) ∧ (∃B)(Branch(B) ∧ (B.branchNo = S.branchNo) ∧ B.city = 'London')
means 'There exists a Branch tuple with the same branchNo as the branchNo of the current
Staff tuple S, and it is located in London.'
Tuple Relational Calculus
The universal quantifier is used in statements about every instance, such as
(∀B)(B.city ≠ 'Paris')
which means 'For all Branch tuples, the address is not in Paris.'
We can also use ~(∃B)(B.city = 'Paris'), which means 'There are no branches with an
address in Paris.'
Tuple Relational Calculus
Formulae should be unambiguous and make sense.
A (well-formed) formula is made out of atoms:
R(Si), where Si is a tuple variable and R is a relation
Si.a1 θ Sj.a2
Si.a1 θ c
We can recursively build up formulae from atoms:
An atom is a formula.
If F1 and F2 are formulae, so are their conjunction F1 ∧ F2, disjunction
F1 ∨ F2 and negation ~F1.
If F is a formula with free variable X, then (∃X)(F) and (∀X)(F) are also
formulae.
Example - Tuple Relational Calculus
a) List the names of all managers who earn more than $25,000.
{S.fName, S.lName | Staff(S) ∧ S.position = 'Manager' ∧ S.salary > 25000}
b) List the staff who manage properties for rent in Glasgow.
{S | Staff(S) ∧ (∃P)(PropertyForRent(P) ∧ (P.staffNo = S.staffNo) ∧ P.city = 'Glasgow')}
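Query (a) above can be read as a filter over a set of tuples; a rough Python analogue, using illustrative tuples that are not part of the original text, makes the declarative reading concrete:

```python
# Rough analogue of query (a): the set of (fName, lName) pairs of Staff
# tuples whose position is 'Manager' and whose salary exceeds 25000.
staff = [
    {'fName': 'John',  'lName': 'White', 'position': 'Manager',   'salary': 30000},
    {'fName': 'Ann',   'lName': 'Beech', 'position': 'Assistant', 'salary': 12000},
    {'fName': 'Susan', 'lName': 'Brand', 'position': 'Manager',   'salary': 24000},
]
result = [(s['fName'], s['lName'])
          for s in staff
          if s['position'] == 'Manager' and s['salary'] > 25000]
print(result)  # only managers earning more than 25000
```

The comprehension mirrors the calculus: the generator ranges over Staff(S), and the `if` clause is the conjunction of predicates.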
Tuple Relational Calculus
Expressions can generate an infinite set For example
S | ~Staff(S)
To avoid this add restriction that all values in result must be values in the domain
of the expression
Data Manipulations in SQL
Select Update Delete Insert Statement
Basic Data retrieval
Condition Specification
Arithmetic and Aggregate operators
SQL Join Multiple Table Queries
Set Manipulation
Any In Contains All Not In Not Contains Exists Union Minus Intersect
Categorization
Updates
Creating Tables
Empty tables are constructed using the CREATE TABLE statement
Data must be entered later using INSERT
CREATE TABLE S ( SNO    CHAR(5),
                 SNAME  CHAR(20),
                 STATUS DECIMAL(3),
                 CITY   CHAR(15),
                 PRIMARY KEY (SNO) )
Creating Tables
A table name and unique column names must be specified
Columns which are defined as primary keys will never have two rows with the same key
value
A primary key may consist of more than one column (values unique in combination); this is
called a composite key.
(b) Explain Data Manipulation in SQL
Ans
A data manipulation language (DML) is a computer programming language used for adding
(inserting) deleting and modifying (updating) data in a database A DML is often
a sublanguage of a broader database language such as SQL with the DML comprising some of
the operators in the language[1] Read-only selecting of data is sometimes distinguished as being
part of a separate data query language (DQL) but it is closely related and sometimes also
considered a component of a DML some operators may perform both selecting (reading) and
writing
A popular data manipulation language is that of Structured Query Language (SQL) which is
used to retrieve and manipulate data in a relational database[2] Other forms of DML are those
used by IMSDLI CODASYL databases such as IDMS and others
In SQL the data manipulation language comprises the SQL-data change statements[3] which
modify stored data but not the schema or database objects Manipulation of persistent database
objects eg tables or stored procedures via the SQL schema statements[3] rather than the data
stored within them is considered to be part of a separate data definition language (DDL) In SQL
these two categories are similar in their detailed syntax data types expressions etc but distinct
in their overall function[3]
The SQL-data change statements are a subset of the SQL-data statements; the latter also contains
the SELECT query statement,[3] which strictly speaking is part of the DQL, not the DML. In
common practice, though, this distinction is not made, and SELECT is widely considered to be
part of DML,[4] so the DML consists of all SQL-data statements, not only the SQL-data
change statements. The SELECT INTO form combines both selection and manipulation,
and thus is strictly considered to be DML because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a
statement which is almost always a verb In the case of SQL these verbs are
SELECT ... FROM ... WHERE ... (strictly speaking DQL)
SELECT ... INTO ...
INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...
For example the command to insert a row into table employees
INSERT INTO employees (first_name, last_name, fname)
VALUES ('John', 'Capita', 'xcapit00')
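The DML verbs listed above can be exercised end to end with SQLite (the employees table mirrors the INSERT example; the UPDATE and DELETE values are illustrative assumptions):

```python
import sqlite3

# The DML verbs from the list above, run against a throwaway table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")
conn.execute("INSERT INTO employees (first_name, last_name, fname) VALUES (?, ?, ?)",
             ('John', 'Capita', 'xcapit00'))
conn.execute("UPDATE employees SET last_name = ? WHERE fname = ?",
             ('Capital', 'xcapit00'))
row = conn.execute("SELECT first_name, last_name FROM employees "
                   "WHERE fname = 'xcapit00'").fetchone()
conn.execute("DELETE FROM employees WHERE fname = 'xcapit00'")
remaining = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(row, remaining)  # the updated row, then the count after deletion
```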
OR
(c) Explain following integrity rules
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce
these rules automatically, but it is safer to make sure that the rules are
applied in the design. There are two types of integrity mentioned in
integrity rules: entity and referential. Two additional rules that aren't
necessarily included in integrity rules but are pertinent to database designs
are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that
is unique; this ensures that each row is uniquely identified by the primary
key. One requirement for entity integrity is that a primary key cannot have a
null value. The purpose of this integrity is for each row to have a unique
identity, so that foreign key values can properly reference primary key values.
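Both requirements, uniqueness and no nulls, can be seen enforced in SQLite (the company table is an illustrative assumption):

```python
import sqlite3

# Sketch: entity integrity means the primary key is unique and non-null;
# a declared PRIMARY KEY ... NOT NULL column rejects both violations.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company "
             "(company_id TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO company VALUES ('C1', 'Acme')")
errors = []
for bad_row in [('C1', 'Duplicate'), (None, 'NullKey')]:
    try:
        conn.execute("INSERT INTO company VALUES (?, ?)", bad_row)
    except sqlite3.IntegrityError as e:
        errors.append(type(e).__name__)
print(errors)  # both the duplicate key and the null key are rejected
```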
Theta Join
In a theta join we apply a condition on the input relation(s), and only the selected rows are
used in the cross product to be merged and included in the output. In a normal cross product,
all the rows of one relation are mapped/merged with all the rows of the second relation, but
here only selected rows of a relation enter the cross product with the second relation. It is
denoted R ⋈θ S.
If R and S are two relations, then θ is the condition applied in the select operation on one
relation, and then only the selected rows are cross-producted with all the rows of the second
relation. For example, given two relations FACULTY and COURSE, we first apply the select
operation on the FACULTY relation to select certain specific rows, and then these rows are
cross-producted with the COURSE relation; this is the difference between a cross product and
a theta join.
Looking at both relations, their different attributes, and finally the cross product after
carrying out the select operation on one relation, the difference between a cross product and
a theta join becomes clear.
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having
the foreign key (in the associated table) reference a primary key value (in the primary, or
parent, table). Because of this, we need to ensure that data on both sides of the relationship
remain intact.
So referential integrity requires that whenever a foreign key value is used it must reference a
valid existing primary key in the parent table
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no
foreign key in any related table with the value of 15. We should only be able to delete a primary
key if there are no associated records; otherwise we would end up with an orphaned record:
the related table would contain a foreign key value that doesn't exist in the primary key field of
the primary table (i.e. the "CompanyId" field), resulting in an "orphaned record".
So referential integrity will prevent users from
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
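These three rules can be watched in action with SQLite, which enforces foreign keys once the corresponding pragma is enabled (the company/product tables are illustrative assumptions):

```python
import sqlite3

# Sketch of the rules above: with foreign keys enforced, an orphaning
# insert and an orphaning delete are both rejected.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product (product_id INTEGER PRIMARY KEY,
                      company_id INTEGER REFERENCES company(company_id));
INSERT INTO company VALUES (15, 'Acme');
INSERT INTO product VALUES (1, 15);
""")
violations = 0
try:
    conn.execute("INSERT INTO product VALUES (2, 99)")  # no company 99 exists
except sqlite3.IntegrityError:
    violations += 1
try:
    conn.execute("DELETE FROM company WHERE company_id = 15")  # would orphan product 1
except sqlite3.IntegrityError:
    violations += 1
print(violations)  # both operations are refused
```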
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually
with no indication of an error. This could result in records being "lost" in the database, because
they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated
company)
Or worse yet it could result in customers not receiving products they paid for
Worse still it could affect life and death situations such as a hospital patient not receiving the
correct treatment or a disaster relief team not receiving the correct supplies or information
Data Integrity
Referential integrity is a subset of data integrity which is concerned with the accuracy and
consistency of all data (relationship or otherwise) Maintaining data integrity is a crucial part of
working with databases
(d) Explain the following in detail with examples:
(i) Domain
Ans Definition: The domain of a database attribute is the set of all allowable values that
attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column.
In data management and database analysis a data domain refers to all the unique values which
a data element may contain The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values[1]
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type and
allowed to have one of two known code values: M for male, F for female, and NULL for
records where gender is unknown or not applicable (or, arguably, U for unknown as a sentinel
value). The data domain for the gender column is {M, F}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value (excluding NULL). Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules if database-enforced may be implemented through a check
constraint or in more complex cases in a database trigger For example a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero
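A database-enforced domain boundary of this kind can be sketched in SQLite with a CHECK constraint (the person table is an illustrative assumption):

```python
import sqlite3

# Sketch: a CHECK constraint restricts the gender column to its data
# domain {M, F}, while still permitting NULL for "unknown".
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE person (
    name   TEXT,
    gender TEXT CHECK (gender IN ('M', 'F'))
)""")
conn.execute("INSERT INTO person VALUES ('Ann', 'F')")
conn.execute("INSERT INTO person VALUES ('Sam', NULL)")  # unknown: CHECK passes
try:
    conn.execute("INSERT INTO person VALUES ('Pat', 'X')")  # outside the domain
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # the out-of-domain value is refused
```

(In SQL, a CHECK whose expression evaluates to NULL is treated as satisfied, which is why the NULL row is accepted.)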
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
(ii) Degree and cardinality
The degree of a relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) with the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
(For the last of these, M:N is the correct notation, not M:M.)
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A
one-to-one relationship rarely exists in practice, but it can; if it does, you may consider
combining the two entities into one.
For example, an employee is allocated a company car which can only be driven by that
employee.
Therefore there is a one-to-one relationship between employee and company car.
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For
example, taking the employee and department entities shown on the previous page, an employee
works in one department, but a department has many employees.
Therefore there is a one-to-many relationship between department and employee.
Many-to-many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would prevent any such relationships, but the
definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely exist in a finished design;
normally they occur because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a
team of many employees.
Therefore there is a many-to-many relationship between employee and project.
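The three degrees can be seen in how tables are declared: a 1:M link is a foreign key on the "many" side, while an M:N link needs a junction table. A minimal SQLite sketch (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 1:M -- one department has many employees (foreign key on the 'many' side)
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employee   (emp_id  INTEGER PRIMARY KEY, name TEXT,
                         dept_id INTEGER REFERENCES department(dept_id));
-- M:N -- employee/project is resolved by a junction table works_on
CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE works_on (emp_id  INTEGER REFERENCES employee(emp_id),
                       proj_id INTEGER REFERENCES project(proj_id),
                       PRIMARY KEY (emp_id, proj_id));

INSERT INTO department VALUES (1, 'Sales');
INSERT INTO employee   VALUES (10, 'Asha', 1), (11, 'Ravi', 1);
INSERT INTO project    VALUES (100, 'CRM'), (101, 'Audit');
INSERT INTO works_on   VALUES (10, 100), (10, 101), (11, 100);
""")
# Read the M:N relationship back through the junction table
per_project = dict(conn.execute(
    "SELECT title, COUNT(*) FROM works_on"
    " JOIN project USING (proj_id) GROUP BY title"))
print(per_project)
```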
Q4
EITHER
(a) Explain DBTG Data Manipulation.
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL), the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct and self-contained language. The DBTG proposal is intended to
meet the requirements of many distinct programming languages, not just COBOL; the user
in a DBTG system is considered to be an ordinary application programmer, and the language
therefore is not biased toward any single specific programming language.
The DBTG proposal is based on the network model. In addition to proposing a formal notation
for networks (the Data Definition Language, or DDL), the DBTG proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of the conceptual scheme that
was itself defined using the Data Definition Language. It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view.
Architecture of DBTG Model
The architecture of a DBTG system is illustrated in the figure. It can be divided into three
different levels, as in the architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
Schema
In DBTG, the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data items they
contain, and the sets into which they are grouped. (Here, logical record types are referred
to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data item and set are excluded.
In the DBTG model, the users are application programmers writing in an ordinary
programming language, such as COBOL, that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data
item) defined in the subschema. The program may refer to these data item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties.
Normalization in industry pays particular attention to
normalization up to 3NF, BCNF or 4NF.
We will pay particular attention up to 3NF.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table:
transform data from the information source (e.g. a form) into table format with columns
and rows.
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
If a table of data meets the definition of a relation, it is in first normal form:
Every relation has a unique name.
Every attribute value is atomic (single-valued).
Every row is unique.
Attributes in tables have unique names.
The order of the columns is irrelevant.
The order of the rows is irrelevant.
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove each repeating group by
entering appropriate data into the empty columns of rows containing the repeating
data ('flattening' the table),
or by
placing the repeating data, along with a copy of the original key attribute(s), into a
separate relation.
Second Normal Form (2NF)
Based on the concept of full functional dependency:
if A and B are attributes of a relation, B is fully dependent on A if B is functionally
dependent on A but not on any proper subset of A.
2NF: a relation that is in 1NF and in which every non-primary-key attribute is fully
functionally dependent on the primary key.
Equivalently: 1NF and no partial functional dependencies.
A partial functional dependency exists when one or more non-key attributes are functionally
dependent on part of the primary key.
Every non-key attribute must be defined by the entire key, not just by part of the key.
If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies on the primary key exist, remove them by placing them in a new
relation along with a copy of their determinant.
Third Normal Form (3NF)
2NF and no transitive dependencies.
A transitive dependency is a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency:
if A, B and C are attributes of a relation such that A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C).
3NF: a relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key.
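The 1NF-to-2NF step can be illustrated with plain Python sets. Assume a hypothetical relation OrderLine(order_id, product_id, qty, product_name) whose key is {order_id, product_id}; product_name depends only on product_id, a partial dependency, so it is moved to its own relation together with a copy of its determinant:

```python
# 1NF rows: (order_id, product_id, qty, product_name)
rows = [
    (1, 101, 2, "Widget"),
    (1, 102, 1, "Gadget"),
    (2, 101, 5, "Widget"),
]

# 2NF decomposition: the partially dependent attribute product_name
# goes into a separate relation keyed by its determinant product_id.
order_line = {(o, p, q) for (o, p, q, _) in rows}      # key: {order_id, product_id}
product    = {(p, name) for (_, p, _, name) in rows}   # key: {product_id}

print(sorted(order_line))
print(sorted(product))
```

Note how "Widget" is now stored once instead of once per order line, which is exactly the redundancy 2NF removes.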
OR
(c) Explain multivalued dependency with suitable example.
Ans: As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute (or set of attributes) on which some other attribute is fully
functionally dependent
Fourth Normal Form (4NF)
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is in
4NF if and only if it is in BCNF and its multivalued dependencies are functional
dependencies. 4NF removes unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
there is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
considers multivalued dependencies.
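A multivalued dependency X ↠ Y holds when, for rows agreeing on X, the Y-values and the remaining attributes combine freely. A small sketch with an assumed relation R(course, teacher, book), where course ↠ teacher and course ↠ book:

```python
from itertools import product

# For course "db": every teacher is paired with every book.
rows = {
    ("db", "smith", "ullman"), ("db", "smith", "maier"),
    ("db", "jones", "ullman"), ("db", "jones", "maier"),
}

def mvd_holds(rows):
    """Check course ->> teacher: for each course, the (teacher, book)
    pairs must be the full cartesian product of the teachers and the
    books that occur with that course."""
    by_course = {}
    for course, teacher, book in rows:
        by_course.setdefault(course, set()).add((teacher, book))
    return all(
        pairs == set(product({t for t, _ in pairs}, {b for _, b in pairs}))
        for pairs in by_course.values())

print(mvd_holds(rows))  # True: the MVD holds, so for 4NF the relation
                        # should be split into (course, teacher) and (course, book)
```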
(d) What are inference axioms? Explain their significance in Relational
Database Design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must
satisfy certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: if Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: if X → Y and X → Z, then X → YZ
F4 Projectivity: if X → YZ, then X → Y
F5 Transitivity: if X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: if X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity of (2) and (4)
[From Maier]
2. Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
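Derivations like the one above can be checked mechanically by computing the closure of the left-hand side under F: AB → GH holds iff both G and H are in the closure of {A, B}. A small sketch:

```python
def closure(attrs, fds):
    """Closure of a set of attributes under FDs given as (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already derivable, the right side is too.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Maier's example: F = {AB -> E, AG -> J, BE -> I, E -> G, GI -> H}
F = [({"A", "B"}, {"E"}), ({"A", "G"}, {"J"}), ({"B", "E"}, {"I"}),
     ({"E"}, {"G"}), ({"G", "I"}, {"H"})]
print(sorted(closure({"A", "B"}, F)))  # contains G and H, so AB -> GH follows
```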
Significance in Relational Database Design: the relational model is a database structure in
which data is stored in two-dimensional tables and multiple relationships between data
elements can be defined and established in an ad hoc manner. A Relational Database
Management System (RDBMS) is a database system made up of files with data elements in
two-dimensional arrays (rows and columns); it has the capability to recombine data elements
to form different relations, resulting in great flexibility of data usage. The inference axioms
matter in this setting because they let the designer derive all functional dependencies implied
by a given set, and hence identify keys and verify normal forms during design.
A relational database is perceived by the user as a collection of two-dimensional tables that:
• are manipulated a set at a time, rather than a record at a time;
• are manipulated using SQL.
• The model was proposed by Dr. Codd in 1970 and is the basis for the relational database
management system (RDBMS).
• The relational model contains the following components:
• a collection of objects or relations;
• a set of operations to act on the relations.
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be dealt with in two ways: one is to set measures
which prevent deadlocks from happening, and the other is to set ways in which to break the
deadlock after it happens. One way to prevent or avoid deadlocks is to require the user to
request all necessary locks at one time, ensuring they gain access to everything they need or
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order to prevent such instances. Once a
deadlock does occur, the DBMS must have a method for detecting the deadlock; then, to
resolve it, the DBMS must select a transaction to cancel and revert that entire transaction
until the resources required become available, allowing one transaction to complete while
the other has to be reprocessed at a later time.
9.21 Explain the meaning of the expression ACID transaction.
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it
should be atomic: it should either be complete or not happen at all; there should not be
anything like "semi-complete". The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, the transactions should
be scheduled in such a fashion that they remain in isolation from one another. Durability
means that once a transaction commits, its effects will persist even if there are system
failures.
9.24 What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database is to operate while transactions are in the
process of being changed. Their purpose is to ensure consistency throughout the database.
For example, if I am changing a row which affects the calculations or outputs of several
other rows, then all rows that are affected, or possibly affected, by a change in the row I am
working on will be locked from changes until I am complete with my change. This isolates
the change and ensures that the data interaction remains accurate and consistent, and is
known as transaction-level consistency. The transaction being changed, which may affect
several other pieces of data or rows of input, could also affect how those rows are read. So
let's say I am processing a change to the tax rate in my state: my store clerk shouldn't be able
to read the total cost of a blue shirt, because the total cost row is affected by any changes in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but hasn't been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
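The resource-ordering idea from the answer above (locks must always be taken in a fixed order, so no circular wait can form) can be sketched with two threads and two locks; the names are illustrative:

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
counter = 0

def transaction(n_times):
    """Every transaction acquires lock_a before lock_b, so two
    transactions can never end up waiting on each other in a cycle."""
    global counter
    for _ in range(n_times):
        with lock_a:
            with lock_b:
                counter += 1  # critical section touching both resources

t1 = threading.Thread(target=transaction, args=(1000,))
t2 = threading.Thread(target=transaction, args=(1000,))
t1.start(); t2.start()
t1.join(); t2.join()
print("counter =", counter)  # both transactions completed; no deadlock
```

If one thread instead took lock_b first, the classic deadlock (each holding one lock while waiting for the other) would become possible.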
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of
two kinds:
Binary locks: a lock on a data item can be in two states; it is either locked or
unlocked.
Shared/exclusive locks: this type of locking mechanism differentiates the locks based on
their uses. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock; allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state. Read locks are shared because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks. Before initiating an execution, the transaction requests the system for all the locks
it needs beforehand. If all the locks are granted, the transaction executes and releases all the
locks when all its operations are over. If all the locks are not granted, the transaction rolls back
and waits until all the locks are granted.
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first
part, when the transaction starts executing, it seeks permission for the locks it requires. The
second part is where the transaction acquires all the locks. As soon as the transaction releases
its first lock, the third phase starts; in this phase, the transaction cannot demand any new locks,
it only releases the acquired locks.
Two-phase locking has two phases: one is growing, where all the locks are being acquired by
the transaction, and the second phase is shrinking, where the locks held by the transaction are
being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and
then upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as 2PL. After acquiring all the locks in the first phase,
the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not
release a lock after using it: Strict-2PL holds all the locks until the commit point and releases
all the locks at one time.
Strict-2PL does not have cascading aborts, as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among transactions at
the time of execution, whereas timestamp-based protocols start working as soon as a
transaction is created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 0002 would be older than all other
transactions that come after it; for example, any transaction y entering the system at 0004 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read- and write-timestamps. This lets the system
know when the last 'read' and 'write' operations were performed on the data item.
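The basic timestamp-ordering rule built on those per-item timestamps can be sketched as follows: a read is rejected if a younger transaction already wrote the item, and a write is rejected if a younger transaction already read or wrote it. This is a simplified sketch of the textbook rule, not a full scheduler:

```python
class Item:
    """A data item carrying the latest read- and write-timestamps."""
    def __init__(self):
        self.read_ts = 0
        self.write_ts = 0

def read(item, ts):
    if ts < item.write_ts:      # a younger transaction already wrote the item
        return "abort"
    item.read_ts = max(item.read_ts, ts)
    return "ok"

def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:  # a younger transaction got there first
        return "abort"
    item.write_ts = ts
    return "ok"

x = Item()
print(write(x, 5))  # ok: nothing newer has touched x
print(read(x, 3))   # abort: transaction 3 is older than the last writer (5)
print(read(x, 7))   # ok: 7 is younger than the last writer
```

Aborted transactions are typically restarted with a new, younger timestamp.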
OR
(b) Explain database security mechanisms. [8]
Database security covers and enforces security on all aspects and components of databases.
This includes:
Data stored in the database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented and maintained by a database
administrator and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d) Explain knowledge based database system in detail.
Ans:
The term knowledge base was coined to distinguish this form of knowledge store from the
more common and widely used term database. At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database. At this point in the history of Information Technology, the distinction
between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
Flat data: data was usually represented in a tabular format with strings or numbers in each
field.
Multiple users: a conventional database needed to support more than one user or system
logged into the same data at the same time.
Transactions: an essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-
called ACID properties: Atomicity, Consistency, Isolation and Durability.
Large, long-lived data: a corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal
representation for a knowledge base is an object model (often called an ontology in the
artificial intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical
demand to store large amounts of data back to a permanent memory store. A more precise
statement would be that, given the technologies available, researchers compromised and did
without these capabilities because they realized they were beyond what could be expected, and
they could develop useful solutions to non-trivial problems without them. Even from the
beginning, the more astute researchers realized the potential benefits of being able to store,
analyze and reuse knowledge; for example, see the discussion of Corporate Memory in the
earliest work of the Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge base compared to a conventional
database. The knowledge base needed to know facts about the world, for example to represent
the statement that all humans are mortal. A database typically could not represent this general
knowledge, but instead would need to store information in thousands of rows that
represented specific humans. Representing that all humans are mortal, and being able to
reason about any given human that they are mortal, is the work of a knowledge base;
representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other
customers are all humans with specific ages, sex, address, etc., is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate
environments, the requirements for their data storage rapidly started to overlap with the
standard database requirements for multiple, distributed users with support for transactions.
Initially, the demand could be seen in two different but competitive markets. From the AI
and object-oriented communities, object-oriented databases such as Versant emerged; these
were systems designed from the ground up to have support for object-oriented capabilities but
also to support standard database services as well. On the other hand, the large database
vendors such as Oracle added capabilities to their products that provided support for
knowledge-base requirements, such as class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge base was the Internet. With the rise of the Internet,
documents, hypertext and multimedia support were now critical for any corporate database. It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents; this created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge
management vendors, such as Lotus Notes. Knowledge management actually predated the
Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term knowledge base to describe their repositories, but the
meaning had a subtle difference. In the case of previous knowledge-based systems, the
knowledge was primarily for the use of an automated system, to reason about and draw
conclusions about the world. With knowledge management products, the knowledge was
primarily meant for humans, for example to serve as a repository of manuals, procedures,
policies, best practices, reusable designs and code, etc. In both cases, the distinctions between
the uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find
a system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018 - 2019
Subject: DBMS
MCA 1st year (Sem II)
QUE 1
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data; there is no way they can interfere with one another. However, any
practical database has a mix of READ and WRITE operations, and hence concurrency is a
challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-
user system. It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, "undividable") is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account
B. Performing these operations in an atomic transaction ensures that the database remains in a
consistent state, that is, money is neither lost nor created if either of those two operations fails.
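The bank-transfer example can be sketched with SQLite, whose connection context manager commits on success and rolls back on error; the account names and balances are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY,"
             " balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

# Atomic transfer: both updates commit together or neither does.
try:
    with conn:  # commits the transaction on success, rolls back on error
        conn.execute("UPDATE account SET balance = balance - 200"
                     " WHERE name = 'A'")  # would go negative: CHECK fails
        conn.execute("UPDATE account SET balance = balance + 200"
                     " WHERE name = 'B'")
except sqlite3.IntegrityError:
    pass  # the whole transfer is undone, not just the failing half

print(dict(conn.execute("SELECT name, balance FROM account")))
```

Money is neither lost nor created: after the failed transfer, both balances are exactly as they were.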
(B) Give the three level architecture proposal for DBMS.
Ans: Objectives of the three level architecture proposal for a DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one that is closest to the user. It is also called the user view. The
user view is different from the way data is stored in the database; this view describes only a
part of the actual database. Because each user is not concerned with the entire database, only
the part that is relevant to the user is visible. For example, end users and application
programmers get different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-
generation language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is
used to control the user's access to database objects.
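The DDL/DML split can be seen in a few lines of SQLite (SQLite has no DCL such as GRANT/REVOKE, so only the first two sublanguages are shown; the student table is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and declare the database object
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# DML: operate on the object
conn.execute("INSERT INTO student (name) VALUES ('Asha')")
conn.execute("UPDATE student SET name = 'Asha K' WHERE id = 1")
names = [row[0] for row in conn.execute("SELECT name FROM student")]
print(names)
```

In a server DBMS, a DCL statement such as GRANT SELECT ON student TO some_user would complete the trio.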
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level
is the view of the data "as it really is". The user's view of the data is constrained by the
language that they are using; at the conceptual level, the data is viewed without any of these
constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three level architecture proposal for a DBMS are explained above.
(C) Describe the structure of DBMS.
Ans: A DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update
and retrieval) on the database. The components of the DBMS perform these requested
operations on the database and provide the necessary data to the users.
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions
specified in the DDL. It includes metadata information such as the names of the files, the data
items, storage details of each file, mapping information, and constraints.
2. DML Compiler and Query Optimizer - The DML commands, such as insert, update, delete
and retrieve, from the application program are sent to the DML compiler for compilation into
object code for database access. The object code is then optimized in the best way to execute
the query by the query optimizer and then sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also
known as the Database Control System.
The main functions of the Data Manager are:
Convert operations in users' queries, coming from the application programs or the
combination of DML compiler and query optimizer (known as the Query Processor), from
the user's logical view to the physical file system.
Control DBMS information access that is stored on disk.
Handle buffers in main memory.
Enforce constraints to maintain consistency and integrity of the data.
Synchronize the simultaneous operations performed by concurrent users.
Control the backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the
database. It contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes,
and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data
definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures, access
paths, and file and record sizes.
5. Access authorization - a description of database users, their responsibilities and their
access rights.
6. Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation and
accuracy. It may be used as an important part of the DBMS.
Importance of Data Dictionary -
The data dictionary is necessary in databases for the following reasons:
It improves the control of the DBA over the information system and the users'
understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the
result of every design phase and the design decisions.
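Most engines expose their data dictionary as an ordinary queryable catalog. In SQLite, for instance, table definitions live in sqlite_master and column details behind PRAGMA table_info (the employee table here is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY,"
             " name TEXT, salary REAL)")

# The catalog is itself data: table names...
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
# ...and per-column metadata (name, type, nullability, key membership)
columns = [r[1] for r in conn.execute("PRAGMA table_info(employee)")]
print(tables, columns)
```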
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts the high-level queries into low-level file
access commands, known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naiumlve User Naive users who need not have aware of the present of the database system or any other system A user of an automatic teller falls under this category The user is instructed through each step of a transaction he or she responds by pressing a coded key or entering a numeric value The operations that can be performed by this calls of users are very limited and affect a precise portion of the database in case of the user of the automatic teller machine only one or more of her or his own accounts Other such naive users are where the type and range of response is always indicated to the user Thus a very competent database designer could be allowed to use a particular database system only as a naive user
ii) Online users: These are users who communicate with the database directly via an online terminal, or indirectly via a user interface and application program. They are aware of the presence of the database system and may have acquired a certain amount of expertise within the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It implies separation of physical storage from the use of the data by an application program, i.e., program/data independence: the user, programmer or application specialist need not know the details of how the data are stored, since such details are transparent to the user. Changes can be made to the data without affecting other components of the system - e.g., changing the format of data items (real to integer arithmetic), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g., from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated when the same data is updated in different files
• Time wasted in entering the same data again and again
• Needless use of computer resources
• Difficulty in combining information
2. Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data, so we need to remove this duplication of data in multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In a conventional system, the availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its currency are likely to improve, since the data can now be shared and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in a database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, the use of a DBMS should allow users who do not know programming to interact with the data more easily, unlike a file processing system, where the programmer may need to write new programs to meet every new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints. In conventional systems, because the data is duplicated in multiple files, updates may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an ad hoc, temporary manner. Often different systems of an organization access different components of the operational data; in such an environment, enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8. Organization's requirements can be identified - All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, as the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. It may thus become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for similar service using conventional systems, since the productivity of programmers can be higher when using the non-procedural languages developed with DBMSs than when using procedural languages.
10. Data model must be developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed as the needs of particular applications demand, and the overall view is often not considered. Building an overall view of an organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes for backup and recovery from failures, including disk crashes, power failures and software errors, which help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain the ER model with a suitable example.
Ans: The ER model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Building it is an iterative, team-oriented process, with all business managers (or designates) involved, and the model should be validated with a "bottom-up" approach. It has three primary components: entities, relationships and attributes. There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym.
Attributes: An attribute is a characteristic of an entity. A Student entity's attributes: student ID, student name, address, etc.
Attributes are of various types
Simple/Single attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds, connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One-to-many   1 ------- M
Many-to-one   M ------- 1
Many-to-many  M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, and address (city, state, zip_code, street), where street is itself composite (street_name, street_number, apartment_number).
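As an illustration, the Customer entity above can be mapped onto a relational table by flattening its composite attributes into their components. A sketch using SQLite; the column names follow the attribute list, and the sample row is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite attributes (name, address, street) are flattened into components.
conn.execute("""CREATE TABLE customer (
                    customer_id   INTEGER PRIMARY KEY,
                    first_name    TEXT,
                    middle_name   TEXT,
                    last_name     TEXT,
                    phone_number  TEXT,
                    date_of_birth TEXT,
                    city TEXT, state TEXT, zip_code TEXT,
                    street_name TEXT, street_number TEXT, apartment_number TEXT)""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name) "
             "VALUES (1, 'John', 'Doe')")
print(conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0])  # 1
```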
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index-sequential files and direct files, we have considered the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file on the attribute "stud_name", we can get the set of records which satisfy the given value.
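The idea can be sketched with a small inverted index on "stud_name" (the student records are invented for illustration; a real system would keep an index structure on disk):

```python
from collections import defaultdict

students = [
    {"roll": 1, "stud_name": "Asha"},
    {"roll": 2, "stud_name": "Ravi"},
    {"roll": 3, "stud_name": "Asha"},
]

# Secondary index: stud_name -> list of matching records.
index = defaultdict(list)
for rec in students:
    index[rec["stud_name"]].append(rec)

# Unlike a primary-key lookup, a secondary-key lookup may return many records.
print([r["roll"] for r in index["Asha"]])  # [1, 3]
```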
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation

QUE3- EITHER
(A) Let R = (A, B, C), and let r1 and r2 both be relations on schema R. Give an equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot be losslessly decomposed into any number of smaller tables. Another way of expressing this is that every join dependency is a consequence of the candidate keys. It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one, you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Liz Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item, and Vendor-Item.
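The decomposition can be checked mechanically: project Buying onto the three pairwise tables and rejoin them. A sketch in Python over the sample data above (the join dependency holds here, so the rejoin is lossless):

```python
# The Buying(buyer, vendor, item) rows from the text's sample data.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach",       "Jeans"),
    ("Mary",  "Jordach",       "Jeans"),
    ("Sally", "Jordach",       "Sneakers"),
}

# Project onto the three pairwise tables.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural join of the three projections.
rejoined = {
    (b, v, i)
    for (b, v) in buyer_vendor
    for (b2, i) in buyer_item if b2 == b
    if (v, i) in vendor_item
}

print(rejoined == buying)  # True: the three-way decomposition is lossless
```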
(B) Explain the architecture of an IMS System
Ans: Information Management System (IMS) is an IBM program product that is designed to support both batch and online application programs.
[Figure: IMS architecture. Application A and Application B, each written in a host language with embedded DL/I calls, communicate with the IMS control program through their program specification blocks (PSB-A, PSB-B), each made up of PCBs; the control program in turn works from the DBDs that define the physical databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD). The mapping of the physical database to storage is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library, from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE#,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called a program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are supported via user-written online application programs. IMS does not provide an integrated query language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
Main characteristics of functional dependencies used in normalization:
• They have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency.
• They hold for all time.
• They are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
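A functional dependency X → Y can be tested on sample data by checking that no two rows agree on X but differ on Y. A minimal sketch (the student rows are invented for illustration):

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in `rows`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # A key that maps to two different values violates the dependency.
        if seen.setdefault(key, val) != val:
            return False
    return True

students = [
    {"id": 1, "name": "Asha", "city": "Nagpur"},
    {"id": 2, "name": "Ravi", "city": "Pune"},
    {"id": 3, "name": "Asha", "city": "Mumbai"},
]
print(fd_holds(students, ["id"], ["name"]))    # True: id determines name
print(fd_holds(students, ["name"], ["city"]))  # False: Asha maps to two cities
```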
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal form condition that it meets, and indicates the degree to which it has been normalized. Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; here we will pay particular attention up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format, and also less vulnerable to update anomalies.
• NF2: non-first normal form
• 1NF: R is in 1NF iff all domain values are atomic
• 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
• 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
• BCNF: R is in BCNF iff every determinant is a candidate key
• Determinant: an attribute on which some other attribute is fully functionally dependent

Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and all its multi-valued dependencies are functional dependencies. 4NF removes unwanted data structures: multi-valued dependencies.
Either of the following conditions must hold for a relation to be in fourth normal form:
• There is no multivalued dependency in the relation, or
• There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it deals with multivalued dependencies.
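The effect of a multivalued dependency can be sketched with invented data: two independent multi-valued facts about an employee force a cross product in a single table, and decomposing on the MVD loses nothing:

```python
# A relation with two independent multi-valued facts about an employee
# (skills and languages) violates 4NF; the rows form a cross product.
emp = {
    ("Raj", "SQL",    "Hindi"),
    ("Raj", "SQL",    "English"),
    ("Raj", "Python", "Hindi"),
    ("Raj", "Python", "English"),
}

# Decompose on the MVD emp ->> skill (and, symmetrically, emp ->> language).
emp_skill = {(e, s) for e, s, l in emp}
emp_lang  = {(e, l) for e, s, l in emp}

# The natural join of the two projections reproduces the original relation.
rejoined = {(e, s, l) for (e, s) in emp_skill for (e2, l) in emp_lang if e2 == e}
print(rejoined == emp)  # True: the decomposition is lossless
```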
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
(C) How is database recovery done? Discuss its different types.
Ans SQL Server database recovery models give you backup-and-restore flexibility The model used will determine how much time and space your backups will take and how great your risk of data loss will
be when a breakdown occurs
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery is easier if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls the following:
• The speed and size of your transaction log backups
• The degree to which you are at risk of losing committed transactions in the event of media failure
Models
There are three types of database recovery model available:
• Full Recovery
• Bulk-Logged Recovery
• Simple Recovery
Full Recovery
This is your best guarantee for full data recovery The SQL Server fully logs all operations so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log When data files are lost because of media failure the transaction log can be backed up
• Database restoration up to any specified point in time can be achieved after media failure for a database file. If your log file is available after the failure, you can restore up to the last committed transaction.
• The Log Marks feature allows you to place reference points in the transaction log and recover to a log mark.
• CREATE INDEX operations are logged, so recovery from a transaction log backup that includes index creations is faster because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(D) Describe deadlocks in a distributed system.
Ans
Creating Tables
Empty tables are constructed using the CREATE TABLE statement; data must be entered later using INSERT.

CREATE TABLE S ( SNO    CHAR(5),
                 SNAME  CHAR(20),
                 STATUS DECIMAL(3),
                 CITY   CHAR(15),
                 PRIMARY KEY (SNO) );
• A table name and unique column names must be specified.
• Columns which are defined as primary keys will never have two rows with the same key value.
• A primary key may consist of more than one column (values unique in combination); this is called a composite key.
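The statements above can be exercised with SQLite from Python; the duplicate-key insert shows the primary key doing its job (a sketch using the supplier table S from the example, with invented rows):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE S ( SNO    CHAR(5),
                     SNAME  CHAR(20),
                     STATUS DECIMAL(3),
                     CITY   CHAR(15),
                     PRIMARY KEY (SNO) )
""")
conn.execute("INSERT INTO S VALUES ('S1', 'Smith', 20, 'London')")

# A second row with the same primary key value is rejected.
try:
    conn.execute("INSERT INTO S VALUES ('S1', 'Jones', 10, 'Paris')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```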
(b) Explain Data Manipulation in SQL
Ans
A data manipulation language (DML) is a computer programming language used for adding
(inserting) deleting and modifying (updating) data in a database A DML is often
a sublanguage of a broader database language such as SQL with the DML comprising some of
the operators in the language[1] Read-only selecting of data is sometimes distinguished as being
part of a separate data query language (DQL) but it is closely related and sometimes also
considered a component of a DML some operators may perform both selecting (reading) and
writing
A popular data manipulation language is that of Structured Query Language (SQL) which is
used to retrieve and manipulate data in a relational database[2] Other forms of DML are those
used by IMSDLI CODASYL databases such as IDMS and others
In SQL the data manipulation language comprises the SQL-data change statements[3] which
modify stored data but not the schema or database objects Manipulation of persistent database
objects eg tables or stored procedures via the SQL schema statements[3] rather than the data
stored within them is considered to be part of a separate data definition language (DDL) In SQL
these two categories are similar in their detailed syntax data types expressions etc but distinct
in their overall function[3]
The SQL-data change statements are a subset of the SQL-data statements; this set also contains the SELECT query statement,[3] which strictly speaking is part of the DQL, not the DML. In common practice, though, this distinction is not made, and SELECT is widely considered to be part of DML,[4] so the DML consists of all SQL-data statements, not only the SQL-data change statements. The SELECT ... INTO form combines both selection and manipulation, and thus is strictly considered to be DML because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a
statement which is almost always a verb In the case of SQL these verbs are
SELECT ... FROM ... WHERE ... (strictly speaking, DQL)
SELECT ... INTO ...
INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...
For example, the command to insert a row into the table employees:

INSERT INTO employees (first_name, last_name, fname)
VALUES ('John', 'Capita', 'xcapit00');
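The four DML verbs can be demonstrated end-to-end with SQLite (a sketch; the employees table mirrors the example above, and parameter markers are used rather than string literals):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# INSERT, UPDATE, SELECT, DELETE in turn.
conn.execute("INSERT INTO employees (first_name, last_name, fname) VALUES (?, ?, ?)",
             ("John", "Capita", "xcapit00"))
conn.execute("UPDATE employees SET last_name = ? WHERE fname = ?",
             ("Capital", "xcapit00"))
rows = conn.execute("SELECT first_name, last_name FROM employees").fetchall()
print(rows)                                                          # [('John', 'Capital')]
conn.execute("DELETE FROM employees WHERE fname = ?", ("xcapit00",))
print(conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0])  # 0
```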
OR
(c) Explain following integrity rules
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce these rules automatically, but it is safer to make sure that the rules are applied in the design. There are two types of integrity mentioned in integrity rules: entity and referential. Two additional rules that aren't necessarily included in integrity rules, but are pertinent to database designs, are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that is unique; this ensures that each row is uniquely identified by the primary key. One requirement for entity integrity is that a primary key cannot have a null value. The purpose of this integrity is for each row to have a unique identity, so that foreign key values can properly reference primary key values.
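Entity integrity can be demonstrated with SQLite. One caveat: SQLite, unlike most RDBMSs, allows NULL in a non-INTEGER primary key unless NOT NULL is declared explicitly, so the sketch adds it to get the standard behaviour described above (the table and rows are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO customer VALUES ('C1', 'Asha')")

# Entity integrity: no duplicate primary keys, no NULL primary keys.
for bad_id, why in [("C1", "duplicate key"), (None, "null key")]:
    try:
        conn.execute("INSERT INTO customer VALUES (?, ?)", (bad_id, "x"))
    except sqlite3.IntegrityError:
        print("rejected:", why)
```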
Theta Join
In a theta join we apply a condition to the input relation(s), and only the selected rows are used in the cross product to be merged and included in the output. In a normal cross product, all the rows of one relation are mapped/merged with all the rows of the second relation, but here only selected rows of a relation take part in the cross product with the second relation. It is denoted R ⋈θ S, where θ is the condition applied in the select operation on one relation; only the selected rows then form the cross product with all the rows of the second relation. For example, given the two relations FACULTY and COURSE, we first apply a select operation on the FACULTY relation to pick certain specific rows, and these rows then form a cross product with the COURSE relation. Looking at both relations, their attributes, and finally the cross product after carrying out the select operation, the difference between a cross product and a theta join becomes clear.
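A theta join is just a cross product filtered by the condition θ. A sketch with invented FACULTY and COURSE rows (the attribute names are assumptions, not from the text):

```python
faculty = [{"fno": 1, "fdept": "CS"}, {"fno": 2, "fdept": "Math"}]
course  = [{"cno": "C1", "cdept": "CS"}, {"cno": "C2", "cdept": "CS"},
           {"cno": "C3", "cdept": "Math"}]

def theta_join(r, s, theta):
    """Cross product of r and s, keeping only the pairs that satisfy theta."""
    return [{**a, **b} for a in r for b in s if theta(a, b)]

# theta: the faculty member's department matches the course's department.
result = theta_join(faculty, course, lambda f, c: f["fdept"] == c["cdept"])
print(len(result))  # 3 matching (faculty, course) pairs out of 6 in the cross product
```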
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship. In relationships, data is linked between two or more tables. This is achieved by having the foreign key (in the associated table) reference a primary key value (in the primary, or parent, table). Because of this, we need to ensure that data on both sides of the relationship remain intact.
So, referential integrity requires that whenever a foreign key value is used, it must reference a valid, existing primary key in the parent table.
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no foreign key in any related table with the value of 15. We should only be able to delete a primary key if there are no associated records. Otherwise, we would end up with an orphaned record: the related table would contain a foreign key value that doesn't exist in the primary key field of the primary table (i.e., the "CompanyId" field).
So referential integrity will prevent users from
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually with no indication of an error. This could result in records being "lost" in the database, because they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated
company)
Or worse yet it could result in customers not receiving products they paid for
Worse still it could affect life and death situations such as a hospital patient not receiving the
correct treatment or a disaster relief team not receiving the correct supplies or information
Data Integrity
Referential integrity is a subset of data integrity, which is concerned with the accuracy and
consistency of all data (relationship or otherwise). Maintaining data integrity is a crucial part of
working with databases.
(d) Explain the following in detail with examples:
(i) Domain
Ans: Definition: The domain of a database attribute is the set of all allowable values that the
attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column.
In data management and database analysis, a data domain refers to all the unique values which
a data element may contain. The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values.[1]
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type, and
allowed to have one of two known code values: M for male, F for female, and NULL for
records where gender is unknown or not applicable (or, arguably, U for unknown as a sentinel
value). The data domain for the gender column is {M, F}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value (excluding NULL). Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check
constraint or, in more complex cases, in a database trigger. For example, a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero.
This definition combines the concept of domain as an area over which control is exercised with
the mathematical idea of a set of values of an independent variable for which a function is
defined.
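Both the enumerated gender domain and the "greater than zero" rule above can be sketched as check constraints in sqlite3 (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Domain of gender restricted to the enumerated codes; NULL stands for
# unknown, and a CHECK that evaluates to NULL does not reject the row
conn.execute("""CREATE TABLE person (
    name   TEXT,
    gender TEXT CHECK (gender IN ('M', 'F')),
    age    INTEGER CHECK (age > 0))""")

conn.execute("INSERT INTO person VALUES ('Mary', 'F', 30)")   # within domain
conn.execute("INSERT INTO person VALUES ('Sam', NULL, 25)")   # unknown gender

try:
    conn.execute("INSERT INTO person VALUES ('Bob', 'X', -5)")  # out of domain
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

More complex boundary rules (for example, ones that consult other tables) would need a trigger instead of a CHECK.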
(ii) Degree and cardinality
The degree of a relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) with the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
Note that the last is written M:N rather than M:M, because the number of occurrences on each
side of the relationship may differ.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-
to-one relationship rarely exists in practice, but it can; when it does, you may consider combining
the two entities into one.
For example, an employee is allocated a company car which can only be driven by that
employee.
Therefore there is a one-to-one relationship between employee and company car.
One-to-many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For
example, taking the employee and department entities, an employee works in
one department, but a department has many employees.
Therefore there is a one-to-many relationship between department and employee.
Many-to-many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would remove any such relationships, but the
definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely exist in a finished design.
Normally they occur because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a
team of many employees.
Therefore there is a many-to-many relationship between employee and project.
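In a relational design, the employee/project M:N relationship above is normally resolved by introducing a linking (junction) table, turning one M:N link into two 1:M links. A sketch in sqlite3 (names and sample data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id  INTEGER PRIMARY KEY, name  TEXT);
CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);
-- junction table: each row records one employee working on one project
CREATE TABLE assignment (
    emp_id  INTEGER REFERENCES employee(emp_id),
    proj_id INTEGER REFERENCES project(proj_id),
    PRIMARY KEY (emp_id, proj_id));
""")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [(1, "Ann"), (2, "Ben")])
conn.executemany("INSERT INTO project VALUES (?, ?)",
                 [(10, "Payroll"), (20, "Website")])
# Ann works on both projects; the Payroll project has a team of two
conn.executemany("INSERT INTO assignment VALUES (?, ?)",
                 [(1, 10), (1, 20), (2, 10)])

team = conn.execute("""SELECT e.name FROM employee e
                       JOIN assignment a ON a.emp_id = e.emp_id
                       WHERE a.proj_id = 10 ORDER BY e.name""").fetchall()
print([t[0] for t in team])  # → ['Ann', 'Ben']
```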
Q4
EITHER
(a) Explain DBTG Data Manipulation.
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL), the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct and self-contained language. The DBTG proposal is intended to meet the
requirements of many distinct programming languages, not just COBOL; the user in a
DBTG system is considered to be an ordinary application programmer, and the language
therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the
Data Definition Language, or DDL), the DBTG proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of a conceptual scheme that
was itself defined using the Data Definition Language. It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view.
Architecture of the DBTG Model
The architecture of a DBTG system is illustrated in the figure. The architecture of the DBTG
model can be divided into three different levels, like the architecture of a database system.
These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
Schema
In DBTG, the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data items they
contain, and the sets into which they are grouped. (Here, logical record types are referred
to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data item, and set are excluded.
In the DBTG model, the users are application programmers writing in an ordinary
programming language, such as COBOL, that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data
item) defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization: the process of decomposing unsatisfactory, "bad" relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties.
Normalization in industry pays particular attention to
normalization up to 3NF, BCNF or 4NF.
We will pay particular attention up to 3NF.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table:
• transform data from the information source (e.g. a form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
If a table of data meets the definition of a relation, it is in first normal form:
• Every relation has a unique name
• Every attribute value is atomic (single-valued)
• Every row is unique
• Attributes in tables have unique names
• The order of the columns is irrelevant
• The order of the rows is irrelevant
UNF to 1NF
• Nominate an attribute or group of attributes to act as the key for the unnormalized table.
• Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
• Remove the repeating group by:
entering appropriate data into the empty columns of rows containing the repeating
data ('flattening' the table),
or by
placing the repeating data, along with a copy of the original key attribute(s), into a
separate relation.
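The 'flattening' step can be illustrated in plain Python with hypothetical customer/order data (the names are invented for the example; each unnormalized row carries a repeating group of order numbers):

```python
# Unnormalized: each customer row carries a repeating group of orders
unf = [
    {"cust_id": 1, "name": "Ann", "orders": [101, 102]},
    {"cust_id": 2, "name": "Ben", "orders": [103]},
]

# 1NF: one atomic value per row/column intersection -- the key attributes
# are repeated for every member of the repeating group
flat = [
    {"cust_id": c["cust_id"], "name": c["name"], "order_no": o}
    for c in unf
    for o in c["orders"]
]
for row in flat:
    print(row)
```

The alternative route would put (cust_id, order_no) pairs into a separate Order relation instead of repeating the customer data.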
Second Normal Form (2NF)
Based on the concept of full functional dependency:
• A and B are attributes of a relation.
• B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A.
2NF: a relation that is in 1NF and in which every non-primary-key attribute is fully
functionally dependent on the primary key.
Equivalently: 1NF and no partial functional dependencies.
• Partial functional dependency: one or more non-key attributes are functionally
dependent on part of the primary key.
• Every non-key attribute must be defined by the entire key, not just by part of the key.
• If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
• Identify the primary key for the 1NF relation.
• Identify the functional dependencies in the relation.
• If partial dependencies exist on the primary key, remove them by placing them in a new
relation along with a copy of their determinant.
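A sketch of this decomposition in plain Python, using a hypothetical OrderLine(order_no, product_no, product_desc, qty) relation with key (order_no, product_no), where product_desc depends on product_no alone (a partial dependency):

```python
# 1NF relation with key (order_no, product_no); product_desc depends on
# product_no alone, so the relation is not in 2NF and "Widget" is stored twice
order_line = [
    (1, "P1", "Widget", 5),
    (1, "P2", "Gadget", 2),
    (2, "P1", "Widget", 7),
]

# Remove the partial dependency: product_desc moves to a new relation
# together with a copy of its determinant, product_no
product = {(p_no, desc) for (_, p_no, desc, _) in order_line}
order_line_2nf = [(o_no, p_no, qty) for (o_no, p_no, _, qty) in order_line]

print(sorted(product))   # each product description now stored exactly once
print(order_line_2nf)    # remaining attributes fully depend on the whole key
```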
Third Normal Form (3NF)
2NF and no transitive dependencies.
Transitive dependency: a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency:
• A, B and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C).
3NF: a relation that is in 1NF and 2NF, and in which no non-primary-key
attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with a suitable example.
Ans:
As normalization proceeds, relations become progressively more restricted
(stronger) in format, and also less vulnerable to update anomalies.
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form (4NF)
Fourth normal form (or 4NF) requires that there are no non-trivial multivalued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is said to be in
4NF if and only if it is in BCNF and its multivalued dependencies are functional
dependencies. 4NF removes unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form, either of these conditions must hold:
• there is no multivalued dependency in the relation, or
• there are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
considers multivalued dependencies.
(d) What are inference axioms? Explain their significance in Relational
Database Design.
Ans: Inference Axioms (A-axioms, or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy
certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: if Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: if X → Y and X → Z, then X → YZ
F4 Projectivity: if X → YZ, then X → Y
F5 Transitivity: if X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: if X → Y and YZ → W, then XZ → W
Examples of the use of the Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity of (2) and (4)
[From Maier]
1. Let R = (ABCDEGHI), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
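The same result can be checked mechanically with the standard attribute-closure algorithm, which repeatedly applies the given FDs until no new attributes can be added; a sketch in Python:

```python
def closure(attrs, fds):
    """Compute the closure of a set of attributes under a set of FDs,
    each FD given as a (lhs, rhs) pair of attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # an FD fires when its whole left side is already in the closure
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# F from the Maier example: AB->E, AG->J, BE->I, E->G, GI->H
F = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]

print(sorted(closure("AB", F)))       # → ['A', 'B', 'E', 'G', 'H', 'I', 'J']
print(set("GH") <= closure("AB", F))  # AB -> GH holds iff GH is in the closure
```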
Significance in Relational Database Design: A relational database is a database structure,
commonly used in GIS, in which data is stored in two-dimensional tables, where multiple
relationships between data elements can be defined and established in an ad-hoc manner. A
Relational Database Management System is a database system made up of files with data
elements in a two-dimensional array (rows and columns). Such a system has the capability to
recombine data elements to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables that:
• are manipulated a set at a time, rather than a record at a time;
• are manipulated using SQL. The relational model, proposed by Dr. Codd in 1970, is
the basis for the relational database management system (RDBMS).
The relational model contains the following components:
• a collection of objects or relations;
• a set of operations to act on the relations.
Q5
EITHER
(a) What is a deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be avoided in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need, or
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order to prevent such instances. Essentially,
once a deadlock does occur, the DBMS must have a method for detecting the deadlock;
then, to resolve it, the DBMS must select a transaction to cancel and revert the entire
transaction until the resources required become available, allowing one transaction to
complete while the other has to be reprocessed at a later time.

Explain the meaning of the expression "ACID transaction".
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it
should be Atomic: it should either complete fully or not at all; there should not
be anything like a semi-complete transaction. The database state should remain Consistent after
the completion of the transaction. If there is more than one transaction, the transactions
should be scheduled in such a fashion that they remain in Isolation from one another. Durability
means that, once a transaction commits, its effects will persist even if there are system
failures.

What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database is to operate while transactions are in the
process of being changed. Their purpose is to ensure consistency throughout the database. For
example, if I am changing a row which affects the calculations or outputs of several other rows,
then all rows that are affected, or possibly affected, by a change in the row I'm working on will
be locked from changes until I am complete with my change. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let's
say I'm processing a change to the tax rate in my state; my store clerk shouldn't be able
to read the total cost of a blue shirt, because the total cost row is affected by any changes in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but hasn't been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
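The resource-access-order rule mentioned in the deadlock answer above (lock resources in a fixed global order) can be sketched with two Python threads standing in for two transactions; the resource names and ranking are invented for the example:

```python
import threading

# Two hypothetical resources; a fixed global rank decides the order in
# which every transaction must lock them
lock_a, lock_b = threading.Lock(), threading.Lock()
RANK = {id(lock_a): 0, id(lock_b): 1}

completed = []

def transfer(src, dst):
    # Avoidance rule: always acquire locks in the global order, so no
    # circular wait (and hence no deadlock) can arise
    first, second = sorted((src, dst), key=lambda l: RANK[id(l)])
    with first, second:
        completed.append(True)

# Without the ordering rule these two could lock a->b and b->a and
# deadlock; with it, both lock a first and then b
t1 = threading.Thread(target=transfer, args=(lock_a, lock_b))
t2 = threading.Thread(target=transfer, args=(lock_b, lock_a))
t1.start(); t2.start(); t1.join(); t2.join()
print("transactions completed:", len(completed))
```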
(b) Explain concurrency control and database recovery in detail.
Ans: In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
• Lock-based protocols
• Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
• Binary locks: a lock on a data item can be in two states; it is either locked or
unlocked.
• Shared/exclusive locks: this type of locking mechanism differentiates the locks based on
their uses. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock; allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state. Read locks are shared, because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks. Before initiating an execution, the transaction requests the system for all the locks it
needs beforehand. If all the locks are granted, the transaction executes and releases all the locks
when all its operations are over. If all the locks are not granted, the transaction rolls back and
waits until all the locks are granted.
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first
part, when the transaction starts executing, it seeks permission for the locks it requires. The
second part is where the transaction acquires all the locks. As soon as the transaction releases its
first lock, the third phase starts. In this phase, the transaction cannot demand any new locks; it
only releases the acquired locks.
Two-phase locking has two phases: one is growing, where all the locks are being acquired by
the transaction; the second phase is shrinking, where the locks held by the transaction are
being released.
To claim an exclusive (write) lock, a transaction may first acquire a shared (read) lock and then
upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as that of 2PL. After acquiring all the locks in the first
phase, the transaction continues to execute normally. But, in contrast to 2PL, Strict-2PL does not
release a lock after using it: Strict-2PL holds all the locks until the commit point and releases
them all at one time.
Strict-2PL does not have cascading aborts, as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 0002 would be older than all other
transactions that come after it. For example, any transaction y entering the system at 0004 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read and write timestamps. This lets the system
know when the last 'read' and 'write' operation was performed on the data item.
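The core read/write rules of basic timestamp ordering can be sketched as follows (a simplification of the full protocol: "rollback" stands for aborting and restarting the too-old transaction):

```python
class DataItem:
    def __init__(self):
        self.read_ts = 0    # timestamp of the youngest transaction to read it
        self.write_ts = 0   # timestamp of the youngest transaction to write it

def write(item, ts):
    """Reject a write by a transaction older than the youngest reader
    or writer of the item; otherwise record the new write timestamp."""
    if ts < item.read_ts or ts < item.write_ts:
        return "rollback"   # too old -- would violate the timestamp order
    item.write_ts = ts
    return "ok"

def read(item, ts):
    """Reject a read if a younger transaction has already written the item."""
    if ts < item.write_ts:
        return "rollback"
    item.read_ts = max(item.read_ts, ts)
    return "ok"

x = DataItem()
print(write(x, ts=5))   # ok
print(read(x, ts=7))    # ok: the reader is younger than the last writer
print(write(x, ts=6))   # rollback: transaction 7 has already read the item
```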
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases. This
includes:
• Data stored in the database
• The database server
• The database management system (DBMS)
• Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator
and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
• Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
• Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
• Physical security of the database server and backup equipment from theft and natural
disasters
• Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d) Explain knowledge-based database systems in detail.
Ans:
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database. At the time (the 1970s), virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database. At this point in the history of Information Technology, the distinction
between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
• Flat data: data was usually represented in a tabular format with strings or numbers in each
field.
• Multiple users: a conventional database needed to support more than one user or system
logged into the same data at the same time.
• Transactions: an essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-
called ACID properties: Atomicity, Consistency, Isolation and Durability.
• Large, long-lived data: a corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades,
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in the artificial
intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users, or for the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze and reuse
knowledge. For example, see the discussion of Corporate Memory in the earliest work on the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database. The knowledge-base needed to know facts about the world: for example, to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store information about thousands of tables that
represented information about specific humans. Representing that all humans are mortal, and
being able to reason about any given human that they are mortal, is the work of a knowledge-
base. Representing that George, Mary, Sam, Jenna, Mike, and hundreds of thousands of other
customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments,
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple, distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and object-oriented
communities, object-oriented databases such as Versant emerged. These were systems designed
from the ground up to have support for object-oriented capabilities, but also to support standard
database services as well. On the other hand, the large database vendors, such as Oracle, added
capabilities to their products that provided support for knowledge-base requirements, such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet. With the rise of the Internet,
documents, hypertext and multimedia support were now critical for any corporate database. It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents. This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge management
vendors, such as Lotus Notes. Knowledge management actually predated the
Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term knowledge-base to describe their repositories, but the
meaning had a subtle difference. In the case of previous knowledge-based systems, the
knowledge was primarily for the use of an automated system, to reason about and draw
conclusions about the world. With knowledge management products, the knowledge was
primarily meant for humans, for example to serve as a repository of manuals, procedures,
policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without their conflicting with one another. Concurrent access is quite easy if all
users are just reading data: there is no way they can interfere with one another. Any practical
database, however, has a mix of read and write operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-
user system. It helps you make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from the Ancient Greek "atomos", meaning undividable) is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur, or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a
consistent state; that is, money is neither lost nor created if either of those two operations fails.
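The bank-transfer example above can be demonstrated with sqlite3, whose connection object acts as a transaction context manager (committing on success, rolling back on error); the account names and balances are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, "
             "balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A", 100), ("B", 50)])
conn.commit()

try:
    with conn:  # one atomic transaction: commit on success, rollback on error
        conn.execute("UPDATE account SET balance = balance + 200 WHERE id='B'")
        # the withdrawal overdraws A and fails the CHECK constraint...
        conn.execute("UPDATE account SET balance = balance - 200 WHERE id='A'")
except sqlite3.IntegrityError:
    pass  # ...so the credit to B is rolled back too: all or nothing

print(dict(conn.execute("SELECT id, balance FROM account")))
```

Neither account changes: the half-finished transfer is never visible, which is exactly the atomicity guarantee described above.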
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
• All users should be able to access the same data.
• A user's view is immune to changes made in other views.
• Users should not need to know physical database storage details.
• The DBA should be able to change database storage structures without affecting the users' views.
• The internal structure of the database should be unaffected by changes to physical aspects of storage.
• The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one that is closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database. This view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part that
is relevant to the user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
• Data Definition Language (DDL)
• Data Manipulation Language (DML)
• Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
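The division of labour between DDL and DML can be sketched via sqlite3 (table and data names are illustrative; SQLite has no user accounts, so the DCL statement is shown only as a comment, as a server DBMS would accept it):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and declare a database object
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("UPDATE student SET name = 'Asha K' WHERE roll = 1")
rows = conn.execute("SELECT name FROM student").fetchall()
print(rows)

# DCL controls access to objects; in a server DBMS this would look like:
#   GRANT SELECT ON student TO clerk
#   REVOKE UPDATE ON student FROM clerk
```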
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level, the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for DBMS are explained
above.
(C) Describe the structure of DBMS.
Ans: The DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update and
retrieval) on the database. The components of the DBMS perform these requested operations on
the database and provide the necessary data to the users.
Fig.: Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - The Data Definition Language compiler processes schema definitions specified in the DDL. It records metadata information such as the names of the files and data items, storage details of each file, mapping information, and constraints.
2 DML Compiler and Query Optimizer - The DML commands, such as insert, update, delete and retrieve, from the application program are sent to the DML compiler for compilation into object code for database access. The object code is then optimized into the best way to execute the query by the query optimizer and then sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also known as the Database Control System.
The main functions of the Data Manager are -
It converts operations in users' queries, coming from the application programs or from the combination of DML compiler and query optimizer (together known as the Query Processor), from the user's logical view to the physical file system.
It controls access to the DBMS information stored on disk.
It handles buffers in main memory.
It enforces constraints to maintain the consistency and integrity of the data.
It synchronizes the simultaneous operations performed by concurrent users.
It controls backup and recovery operations.
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures, access paths, and file and record sizes.
5. Access authorization - the description of database users, their responsibilities and their access rights.
6. Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to control data integrity, database operation and accuracy, and may be used as an important part of the DBMS.
Importance of Data Dictionary -
The data dictionary is necessary in databases for the following reasons:
• It improves the control of the DBA over the information system and the users' understanding of the use of the system.
• It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts the high-level queries into low-level file access commands known as compiled DML.
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve Users - Naive users need not be aware of the presence of the database system or any other system supporting them. A user of an automatic teller machine falls into this category: the user is instructed through each step of a transaction, and responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect only a precise portion of the database - in the case of the ATM user, only one or more of his or her own accounts. Other naive users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online Users - These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. They are aware of the presence of the database system and may have acquired a certain amount of expertise within the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application Programmers - Professional programmers who are responsible for developing the application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator - Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It implies separation of physical storage from the use of the data by an application program, i.e. program/data independence: the user, programmer or application specialist need not know the details of how the data are stored, as such details are transparent to the user. Changes can be made to the data without affecting other components of the system, e.g. changing the format of data items (real to integer arithmetic), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to duplication of the same data in different files
• Time wasted in entering the same data again and again
• Computer resources needlessly used
• Difficulty in combining information
2 Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data. We therefore need to remove this duplication of data across multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, the use of a DBMS should allow users who don't know programming to interact with the data more easily, unlike a file processing system, where the programmer may need to write new programs to meet every new demand.
4 Flexibility of the system is improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5 Integrity can be improved - Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints. In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7 Security can be improved - In conventional systems, applications are developed in an ad hoc, temporary manner. Often different systems of an organization access different components of the operational data; in such an environment, enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8 Organization's requirements can be identified - All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, as the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. It may therefore become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for a similar service using conventional systems, since the productivity of programmers can be higher using the non-procedural languages that have been developed with DBMSs than using procedural languages.
10 Data model must be developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed as the needs of particular applications demand. The overall view is often not considered. Building an overall view of an organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes such as recovery and backup from failures, including disk crashes, power failures and software errors, which help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods can be very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans The E-R model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Modeling is an iterative, team-oriented process in which all business managers (or their designates) should be involved, and it should be validated with a "bottom-up" approach. The model has three primary components: entities, relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity - An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type; there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes - An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name, address, etc.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship - A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer; a supervises relationship between an employee and a department; a performs relationship between an artist and a song; a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds, connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), where street is itself composite: street (street_name, street_number, apartment_number).
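The Customer entity above can be mapped to tables, sketched here with Python's sqlite3: composite attributes (name, address) are flattened into columns, and the multivalued attribute (phone_number) becomes its own table. The sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Composite attributes are flattened into individual columns.
cur.execute("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    first_name    TEXT, middle_name TEXT, last_name TEXT,
    date_of_birth TEXT,
    city TEXT, state TEXT, zip_code TEXT,
    street_name TEXT, street_number TEXT, apartment_number TEXT
)""")

# A multivalued attribute becomes a separate table keyed by the
# owning entity's primary key.
cur.execute("""
CREATE TABLE customer_phone (
    customer_id  INTEGER REFERENCES customer(customer_id),
    phone_number TEXT,
    PRIMARY KEY (customer_id, phone_number)
)""")

cur.execute("INSERT INTO customer (customer_id, first_name, last_name) "
            "VALUES (1, 'John', 'Smith')")
cur.executemany("INSERT INTO customer_phone VALUES (1, ?)",
                [("555-0100",), ("555-0101",)])
phones = [p for (p,) in cur.execute(
    "SELECT phone_number FROM customer_phone "
    "WHERE customer_id = 1 ORDER BY phone_number")]
print(phones)  # ['555-0100', '555-0101']
conn.close()
```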
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example
Ans In sequential files, index sequential files and direct files, we have considered the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
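The idea can be sketched in Python: a secondary index maps a non-unique key ("stud_name") to every matching record, unlike a primary index, which maps each key to exactly one record. The sample records are invented.

```python
from collections import defaultdict

# A small student file; roll_no is the primary key, stud_name is not unique.
students = [
    {"roll_no": 1, "stud_name": "Ravi"},
    {"roll_no": 2, "stud_name": "Meena"},
    {"roll_no": 3, "stud_name": "Ravi"},
]

# Build a secondary index on stud_name.
name_index = defaultdict(list)
for rec in students:
    name_index[rec["stud_name"]].append(rec["roll_no"])

# Retrieval by secondary key returns a set of records, not just one.
print(name_index["Ravi"])  # [1, 3]
```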
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE3- EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries -
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
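The three queries are the relational set operations, which can be sketched directly with Python sets of tuples over schema R(A, B, C); the tuple values are invented.

```python
# r1 and r2 are relations on the same schema R(A, B, C).
r1 = {(1, "x", 10), (2, "y", 20)}
r2 = {(2, "y", 20), (3, "z", 30)}

union        = r1 | r2   # r1 ∪ r2: tuples in either relation
intersection = r1 & r2   # r1 ∩ r2: tuples in both relations
difference   = r1 - r2   # r1 − r2: tuples in r1 but not r2

print(len(union), intersection, difference)
# 3 {(2, 'y', 20)} {(1, 'x', 10)}
```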
QUE4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot be non-trivially decomposed any further into smaller tables by lossless joins.
Another way of expressing this is that every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependency - if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise);
For any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
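The decomposition into Buyer-Vendor, Buyer-Item and Vendor-Item can be sketched in Python: with the sample data above, the natural join of the three projections reproduces the original table, which is exactly the join dependency that 5NF captures.

```python
# The Buying table from the sample data.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three pairwise projections (the 5NF decomposition).
buyer_vendor = {(b, v) for b, v, _ in buying}
buyer_item   = {(b, i) for b, _, i in buying}
vendor_item  = {(v, i) for _, v, i in buying}

# Natural join of the three projections.
rejoined = {
    (b, v, i)
    for (b, v) in buyer_vendor
    for (b2, i) in buyer_item if b2 == b
    if (v, i) in vendor_item
}
print(rejoined == buying)  # True: the decomposition is lossless
```

Recording "Claiborne starts to sell jeans" then becomes a single insert into Vendor-Item rather than one row per buyer.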
(B) Explain the architecture of an IMS System
Ans Information Management system (IMS) is an IBM program product that is designed to support
both batch and online application programs
Fig: Structure of an IMS system - application programs A and B, each written in a host language plus DL/I, access data through the PCBs of their respective PSBs; the IMS control program maps these onto the physical databases defined by DBDs.
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD). The mapping of the physical database to storage is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description) - Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE#,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called a program specification block (PSB).
PCB (Program Communication Block) - Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block) - The set of all PCBs for a given user forms that user's program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT - The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are supported via user-written online application programs; IMS does not provide an integrated query language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency - The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key - A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
Main characteristics of the functional dependencies used in normalization:
• they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency;
• they hold for all time;
• they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
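A functional dependency X → Y can be checked mechanically on a relation instance: no two tuples may agree on X yet differ on Y. A small sketch, with an invented employee relation:

```python
def holds(rows, x, y):
    """Return True if the FD x -> y holds in the given relation instance."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in x)
        val = tuple(row[a] for a in y)
        # If this X-value was seen before with a different Y-value, FD fails.
        if seen.setdefault(key, val) != val:
            return False
    return True

emp = [
    {"emp_id": 1, "dept": "HR", "dept_head": "Asha"},
    {"emp_id": 2, "dept": "HR", "dept_head": "Asha"},
    {"emp_id": 3, "dept": "IT", "dept_head": "Ravi"},
]
print(holds(emp, ["dept"], ["dept_head"]))   # True:  dept -> dept_head
print(holds(emp, ["dept_head"], ["emp_id"])) # False: dept_head does not determine emp_id
```

Note that this tests only one instance; a true FD must hold for all time, as stated above.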
(D) Explain 4 NF with examples
Ans Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal-form condition that it meets, and indicates the degree to which it has been normalized. Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normal forms up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
BCNF: R is in BCNF iff every determinant is a candidate key
Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and only if it is in BCNF and all of its multivalued dependencies are in fact functional dependencies. 4NF thus removes an unwanted kind of data structure: multivalued dependencies.
For a relation to be in fourth normal form, either:
• there is no multivalued dependency in the relation, or
• there are multivalued dependencies, but the attributes are dependent between themselves.
One of these conditions must hold, and the relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers multivalued dependencies.
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
(C) How is database recovery done? Discuss its different types.
Ans SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000. It has a built-in feature known as the database recovery model that controls the following:
• the speed and size of your transaction log backups;
• the degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery models available:
• Full Recovery
• Bulk-Logged Recovery
• Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. The SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
• Database restoration up to any specified time can be achieved after a media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
• The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
• CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index creations is faster, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(D) Describe deadlocks in a Distributed System.
Ans
change statements. The SELECT ... INTO form combines both selection and manipulation, and thus is strictly considered to be DML because it manipulates (i.e. modifies) data.
Data manipulation languages have their functional capability organized by the initial word in a statement, which is almost always a verb. In the case of SQL, these verbs are:
• SELECT ... FROM ... WHERE (strictly speaking, DQL)
• SELECT ... INTO
• INSERT INTO ... VALUES
• UPDATE ... SET ... WHERE
• DELETE FROM ... WHERE
For example, the command to insert a row into table employees:
INSERT INTO employees (first_name, last_name, fname) VALUES ('John', 'Capita', 'xcapit00');
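The same insert can be sketched through Python's DB-API with placeholders, so the values travel separately from the SQL text. The column names follow the example above; the table itself is created here (with assumed TEXT columns) only to make the sketch runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, fname TEXT)")

# Placeholders (?) keep the DML verb structure while avoiding string
# concatenation of values into the statement.
cur.execute("INSERT INTO employees (first_name, last_name, fname) VALUES (?, ?, ?)",
            ("John", "Capita", "xcapit00"))
row = cur.execute("SELECT first_name, last_name FROM employees").fetchone()
print(row)  # ('John', 'Capita')
conn.close()
```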
OR
(c) Explain the following integrity rules -
(i) Entity Integrity
Integrity rules are imperative to a good database design. Most RDBMSs enforce these rules automatically, but it is safer to make sure that the rules are already applied in the design. There are two types of integrity mentioned in integrity rules: entity and referential. Two additional rules that aren't necessarily included in integrity rules, but are pertinent to database designs, are business rules and domain rules.
Entity integrity exists when each primary key within a table has a value that is unique; this ensures that each row is uniquely identified by the primary key. One requirement for entity integrity is that a primary key cannot have a null value. The purpose of this integrity is for each row to have a unique identity, so that foreign key values can properly reference primary key values.
Theta Join
In a theta join we apply a condition to the input relation(s), and then only the selected rows are used in the cross product to be merged and included in the output. In a normal cross product, all the rows of one relation are mapped/merged with all the rows of the second relation, but here only selected rows of a relation enter the cross product with the second relation. It is denoted with the symbol θ.
If R and S are two relations, then θ is the condition which is applied in a select operation on one relation, and only the selected rows enter the cross product with all the rows of the second relation. For example, given two relations FACULTY and COURSE, we first apply a select operation on the FACULTY relation to select certain specific rows; these rows then take part in a cross product with the COURSE relation. From this example, the difference between a plain cross product and a theta join becomes clear.
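The contrast between a theta join and an unrestricted cross product can be sketched in Python; the FACULTY and COURSE schemas and rows below are invented for illustration.

```python
faculty = [
    {"fac_id": 1, "name": "Dr. Khan", "dept": "CS"},
    {"fac_id": 2, "name": "Dr. Iyer", "dept": "Math"},
]
course = [
    {"course_id": "C101", "dept": "CS",   "title": "DBMS"},
    {"course_id": "M201", "dept": "Math", "title": "Algebra"},
]

# Theta join: only pairs satisfying the condition faculty.dept = course.dept.
theta = [(f["name"], c["title"])
         for f in faculty for c in course
         if f["dept"] == c["dept"]]

# Unrestricted cross product: every row paired with every row.
cross = [(f["name"], c["title"]) for f in faculty for c in course]

print(len(cross), theta)
# 4 [('Dr. Khan', 'DBMS'), ('Dr. Iyer', 'Algebra')]
```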
(ii) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship
In relationships data is linked between two or more tables This is achieved by having
the foreign key (in the associated table) reference a primary key value (in the primary, or
parent, table). Because of this, we need to ensure that data on both sides of the relationship
remain intact
So referential integrity requires that whenever a foreign key value is used it must reference a
valid existing primary key in the parent table
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no
foreign key in any related table with the value of 15. We should only be able to delete a primary
key if there are no associated records; otherwise we would end up with an orphaned record:
a related table containing a foreign key value that doesn't exist in the primary key field of
the primary table (i.e. the "CompanyId" field).
So referential integrity will prevent users from:
• Adding records to a related table if there is no associated record in the primary table
• Changing values in a primary table that result in orphaned records in a related table
• Deleting records from a primary table if there are matching related records
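A minimal sketch of these rules, using Python's sqlite3 module with foreign-key enforcement switched on (the CompanyId-style schema is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """CREATE TABLE product (
           product_id INTEGER PRIMARY KEY,
           company_id INTEGER REFERENCES company(company_id)
       )"""
)
conn.execute("INSERT INTO company VALUES (15, 'Acme')")
conn.execute("INSERT INTO product VALUES (1, 15)")

# Adding a child row with no matching parent is rejected (no orphans)...
try:
    conn.execute("INSERT INTO product VALUES (2, 99)")
    orphan_insert_rejected = False
except sqlite3.IntegrityError:
    orphan_insert_rejected = True

# ...and so is deleting a parent that still has matching child rows.
try:
    conn.execute("DELETE FROM company WHERE company_id = 15")
    parent_delete_rejected = False
except sqlite3.IntegrityError:
    parent_delete_rejected = True

print(orphan_insert_rejected, parent_delete_rejected)  # True True
```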
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually
with no indication of an error. This could result in records being "lost" in the database because
they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated
company)
Or worse yet it could result in customers not receiving products they paid for
Worse still it could affect life and death situations such as a hospital patient not receiving the
correct treatment or a disaster relief team not receiving the correct supplies or information
Data Integrity
Referential integrity is a subset of data integrity which is concerned with the accuracy and
consistency of all data (relationship or otherwise) Maintaining data integrity is a crucial part of
working with databases
(d) Explain the following domains in detail with examples
Ans. Definition: The domain of a database attribute is the set of all allowable values that
attribute may assume.
Examples
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column.
In data management and database analysis a data domain refers to all the unique values which
a data element may contain The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values[1]
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type and
allowed to have one of two known code values: M for male, F for female, and NULL for
records where gender is unknown or not applicable (or, arguably, U for unknown as a sentinel
value). The data domain for the gender column is {M, F}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value, excluding NULL. Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check
constraint or, in more complex cases, in a database trigger. For example, a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero.
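A short sketch of such domain rules, using Python's sqlite3 module on a hypothetical table: one CHECK constraint enumerates the gender domain and another requires positive values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE person (
           name   TEXT,
           gender TEXT CHECK (gender IN ('M', 'F')),   -- domain {M, F}
           age    INTEGER CHECK (age > 0)              -- positive values only
       )"""
)
conn.execute("INSERT INTO person VALUES ('Mary', 'F', 30)")

try:
    conn.execute("INSERT INTO person VALUES ('Sam', 'X', 25)")  # outside the domain
    out_of_domain_rejected = False
except sqlite3.IntegrityError:
    out_of_domain_rejected = True

print(out_of_domain_rejected)  # True
```

Note that in standard SQL a NULL value passes a CHECK constraint (the predicate is unknown, not false), which is consistent with using NULL for "gender unknown" above.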
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
(ii) Degree and cardinality
The degree of relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) to the number of occurrences in another
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
Note that the correct notation for the last one is M:N, not M:M, because the number of
occurrences on each side need not be the same.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-
to-one relationship rarely exists in practice, but it can; if one does, you may consider combining
the two entities into one.
For example an employee is allocated a company car which can only be driven by that
employee
Therefore there is a one-to-one relationship between employee and company car
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example,
taking the employee and department entities, an employee works in
one department, but a department has many employees.
Therefore there is a one-to-many relationship between department and employee
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity
The normalisation process discussed earlier would prevent any such relationships, but the
definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely exist in a finished design;
normally they occur because an entity has been missed.
For example an employee may work on several projects at the same time and a project has a
team of many employees
Therefore there is a many-to-many relationship between employee and project
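In an implemented schema, a many-to-many relationship is usually resolved into two one-to-many relationships through an associative (junction) table. A sketch with Python's sqlite3 module, using hypothetical employee and project tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (emp_id  INTEGER PRIMARY KEY, name  TEXT);
    CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);
    -- The M:N relationship becomes two 1:M relationships through
    -- this associative (junction) table.
    CREATE TABLE works_on (
        emp_id  INTEGER REFERENCES employee(emp_id),
        proj_id INTEGER REFERENCES project(proj_id),
        PRIMARY KEY (emp_id, proj_id)
    );
    INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Inventory');
    INSERT INTO works_on VALUES (1, 10), (1, 20), (2, 10);
    """
)
# Employee 1 works on two projects, while project 10 has a team of two.
emp_projects = conn.execute(
    "SELECT COUNT(*) FROM works_on WHERE emp_id = 1"
).fetchone()[0]
proj_team = conn.execute(
    "SELECT COUNT(*) FROM works_on WHERE proj_id = 10"
).fetchone()[0]
print(emp_projects, proj_team)  # 2 2
```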
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans. The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL), the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct and self-contained language. The DBTG proposal is intended to meet the
requirements of many distinct programming languages, not just COBOL; the user in a
DBTG system is considered to be an ordinary application programmer, and the language
therefore is not biased toward any single specific programming language.
(b) It is based on network model In addition to proposing a formal notation for networks (the
Data Definition Language or DDL) the DBTG has proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of conceptual scheme that
was itself defined using the Data Definition Language It also proposed a Data
Manipulation Language (DML) suitable for writing applications programs that
manipulate the conceptual scheme or a view
(c) Architecture of DBTG Model
(d) The architecture of a DBTG system is illustrated in Figure
(e) The architecture of DBTG model can be divided in three different levels as the
architecture of a database system These are
(f) • Storage Schema (corresponds to the Internal View of the database)
(g) • Schema (corresponds to the Conceptual View of the database)
(h) • Subschema (corresponds to the External View of the database)
(i) Storage Schema
(j) The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
(k) Schema
(l) In DBTG, the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data-items they
contain, and the sets into which they are grouped. (Here, logical record types are referred
to as record types; the fields in a logical record format are called data-items.)
(m) Subschema
(n) The External view (not a DBTG term) is defined by a subschema A subschema consists
essentially of a specification of which schema record types the user is interested in which
schema data-items he or she wishes to see in those records and which schema
relationships (sets) linking those records he or she wishes to consider By default all
other types of record data-item and set are excluded
(o) In DBTG model the users are application programmers writing in an ordinary
programming language such as COBOL that has been extended to include the DBTG
data manipulation language Each application program invokes the corresponding
subschema using the COBOL Data Base Facility for example the programmer simply
specifies the name of the required subschema in the Data Division of the program This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each type of data-
item) defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans. Normalization: The process of decomposing unsatisfactory (bad) relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties
Normalization in industry pays particular attention to
normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups
To create an unnormalized table
transform data from information source (eg form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which intersection of each row and column contains one and only one value
If a table of data meets the definition of a relation it is in first normal form
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table
Identify repeating group(s) in unnormalized table which repeats for the key attribute(s)
Remove repeating group by
entering appropriate data into the empty columns of rows containing repeating
data ('flattening' the table)
Or by
placing repeating data along with copy of the original key attribute(s) into a
separate relation
Second Normal Form (2NF)
Based on concept of full functional dependency
A and B are attributes of a relation
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A
2NF - A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on the primary key
Second Normal Form (2NF)
1NF and no partial functional dependencies
Partial functional dependency when one or more non-key attributes are functionally
dependent on part of the primary key
Every non-key attribute must be defined by the entire key not just by part of the key
If a relation has a single attribute as its key then it is automatically in 2NF
1NF to 2NF
Identify primary key for the 1NF relation
Identify functional dependencies in the relation
If partial dependencies exist on the primary key, remove them by placing them in a new
relation along with a copy of their determinant
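The 1NF-to-2NF step can be sketched in plain Python on a hypothetical enrolment relation whose key is (student_id, course_id); student_name depends only on student_id, so it is moved into a new relation together with a copy of its determinant.

```python
# 1NF relation with key (student_id, course_id); student_name depends only on
# student_id, a partial dependency that 2NF removes by decomposition.
enrolment_1nf = [
    {"student_id": 1, "course_id": "C1", "student_name": "Asha", "grade": "A"},
    {"student_id": 1, "course_id": "C2", "student_name": "Asha", "grade": "B"},
    {"student_id": 2, "course_id": "C1", "student_name": "Ravi", "grade": "A"},
]

# New relation: the partially dependent attribute plus its determinant.
names = {r["student_id"]: r["student_name"] for r in enrolment_1nf}
student_rel = [
    {"student_id": sid, "student_name": name} for sid, name in sorted(names.items())
]

# Remaining relation: the full key and the fully dependent attribute.
enrolment_2nf = [
    {"student_id": r["student_id"], "course_id": r["course_id"], "grade": r["grade"]}
    for r in enrolment_1nf
]

print(len(student_rel), len(enrolment_2nf))  # 2 3
```

The redundant repetition of each student's name (once per course) disappears; it is now stored exactly once per student.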
Third Normal Form (3NF)
2NF and no transitive dependencies
Transitive dependency: a functional dependency between two or more non-key attributes
Based on the concept of transitive dependency:
A, B and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C)
3NF - A relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key
OR
(c) Explain multivalued dependency with a suitable example.
Ans. As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is said to be in
4NF if and only if it is in BCNF and its multi-valued dependencies are functional
dependencies. 4NF removes unwanted data structures: multi-valued dependencies.
Either of the following conditions must hold for a relation to be in fourth normal form:
• There is no multivalued dependency in the relation, or
• There are multivalued dependencies, but the dependent attributes depend on each other.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
involves multivalued dependencies.
(d) What are inference axioms? Explain their significance in Relational
Database Design.
Ans. Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy
certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → City Street Zip.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity from (2) and (4)
[From Maier]
1. Let R = (A B C D E G H I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived from F.
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
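A practical way to check such derivations mechanically is to compute the closure of an attribute set under F: X → Y is derivable exactly when Y is contained in the closure of X. A short Python sketch, run on the Maier example above:

```python
def closure(attrs, fds):
    """Closure of an attribute set under a list of FDs given as (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already in the closure, pull in the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Maier's example: F = {AB->E, AG->J, BE->I, E->G, GI->H}
F = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]
ab_plus = closure("AB", F)
print(sorted(ab_plus))  # ['A', 'B', 'E', 'G', 'H', 'I', 'J']
# G and H are in AB+, so AB -> GH holds, matching the 12-step proof.
```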
Significance in Relational Database design: A relational database is a database structure, commonly used in
GIS, in which data is stored in two-dimensional tables and multiple relationships between data
elements can be defined and established in an ad-hoc manner. A Relational Database Management
System is a database system made up of files with data elements in two-dimensional arrays (rows
and columns). This database management system has the capability to recombine data elements
to form different relations, resulting in great flexibility of data usage.
A database that is perceived by the user as a collection of two-dimensional tables:
• Tables are manipulated a set at a time rather than a record at a time
• SQL is used to manipulate relational databases; the relational model was proposed by Dr. Codd in 1970
• It is the basis for the relational database management system (RDBMS)
• The relational model contains the following components:
• A collection of objects or relations
• A set of operations to act on the relations
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans. A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be handled in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order to prevent such instances. Essentially,
once a deadlock does occur, the DBMS must have a method for detecting the deadlock;
then, to resolve it, the DBMS must select a transaction to cancel and revert the entire
transaction until the resources required become available, allowing one transaction to
complete while the other has to be reprocessed at a later time.
Explain the meaning of the expression ACID transaction.
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens it
should be atomic: it should either complete fully or not at all; there should not
be anything like a semi-complete transaction. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, then the transactions
should be scheduled in such a fashion that they remain in isolation from one another. Durability
means that once a transaction commits, its effects will persist even if there are system
failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the
process of being changed. Their purpose is to ensure consistency throughout the database. For example, if I
am changing a row which affects the calculations or outputs of several other rows, then
all rows that are affected, or possibly affected, by a change in the row I'm working on will
be locked from changes until I am complete with my change. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let's
say I'm processing a change to the tax rate in my state; my store clerk shouldn't be able
to read the total cost of a blue shirt, because the total cost row is affected by any changes in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but hasn't been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
(b) Explain concurrency control and database recovery in detail
Ans. In a multiprogramming environment where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
• Lock-based protocols
• Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
• Binary locks: A lock on a data item can be in two states; it is either locked or
unlocked.
• Shared/exclusive: This type of locking mechanism differentiates the locks based on
their uses. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock; allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state. Read locks are shared because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking 2PL
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
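The growing/shrinking discipline can be sketched in Python (a toy illustration, not a real lock manager): once a transaction has released its first lock, any further lock request is a 2PL violation.

```python
import threading

class TwoPhaseTransaction:
    """Toy sketch of two-phase locking: after the first release (the start of
    the shrinking phase), no new lock may be demanded."""

    def __init__(self):
        self.held = {}
        self.shrinking = False

    def lock(self, item, lk):
        if self.shrinking:
            raise RuntimeError("2PL violated: cannot lock after first unlock")
        lk.acquire()
        self.held[item] = lk

    def release(self, item):
        self.shrinking = True          # entering the shrinking phase
        self.held.pop(item).release()

a, b = threading.Lock(), threading.Lock()
txn = TwoPhaseTransaction()
txn.lock("A", a)        # growing phase: acquire all needed locks
txn.lock("B", b)
txn.release("A")        # first release starts the shrinking phase
try:
    txn.lock("A", a)    # demanding a new lock now violates the protocol
    violated = False
except RuntimeError:
    violated = True
txn.release("B")
print(violated)  # True
```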
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the
transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a
lock after using it: Strict-2PL holds all the locks until the commit point and releases them all
at one time.
Strict-2PL does not have cascading aborts as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution whereas timestamp-based protocols start working as soon as a transaction is
created
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at 00:02 clock time would be older than all other
transactions that come after it; for example, any transaction y entering the system at 00:04 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read and write timestamp. This lets the system
know when the last 'read and write' operation was performed on the data item.
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases This
includes:
• Data stored in the database
• The database server
• The database management system (DBMS)
• Other database workflow applications
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include:
• Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
• Load/stress testing and capacity testing of a database to ensure it does not crash under a
distributed denial of service (DDoS) attack or user overload
• Physical security of the database server and backup equipment against theft and natural
disasters
• Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d) Explain knowledge-based database systems in detail.
Ans.
The term knowledge base was coined to distinguish this form of knowledge store from the
more common and widely used term database. At the time (the 1970s), virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database. At this point in the history of Information Technology, the distinction
between a database and a knowledge base was clear and unambiguous.
A database had the following properties
• Flat data: Data was usually represented in a tabular format with strings or numbers in each
field.
• Multiple users: A conventional database needed to support more than one user or system
logged into the same data at the same time.
• Transactions: An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-
called ACID properties: Atomicity, Consistency, Isolation and Durability.
• Large, long-lived data: A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in the artificial
intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze and reuse
knowledge. For example, see the discussion of Corporate Memory in the earliest work on the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database. The knowledge base needed to know facts about the world, for example, to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store thousands of rows in tables that
represented information about specific humans. Representing that all humans are mortal, and
being able to reason about any given human that they are mortal, is the work of a knowledge
base. Representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other
customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments,
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple, distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and Object-Oriented
communities, object-oriented databases such as Versant emerged. These were systems designed
from the ground up to have support for object-oriented capabilities, but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements, such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents. This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes. Knowledge management actually predated the
Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term knowledge base to describe their repositories, but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans. Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data: there is no way they can interfere with one another.
However, any practical database has a mix of READ and WRITE operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur with a multi-
user system. It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is an essential element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and depositing it in account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails.
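The transfer example can be sketched with Python's sqlite3 module, where the connection's context manager gives all-or-nothing behaviour: a simulated failure between the two updates rolls the withdrawal back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")
conn.commit()

# Transfer 30 from A to B: both updates commit together or not at all.
try:
    with conn:  # one transaction; an exception inside rolls it back
        conn.execute("UPDATE account SET balance = balance - 30 WHERE name = 'A'")
        raise RuntimeError("simulated crash between the two operations")
        conn.execute("UPDATE account SET balance = balance + 30 WHERE name = 'B'")
except RuntimeError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'A': 100, 'B': 50} - the partial withdrawal was rolled back
```

Money is neither lost nor created: because the withdrawal and the deposit form one atomic unit, the failed transfer leaves both balances unchanged.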
(B) Give the three level architecture proposal for DBMS
Ans. Objectives of the three level architecture proposal for DBMS:
• All users should be able to access the same data.
• A user's view is immune to changes made in other views.
• Users should not need to know physical database storage details.
• The DBA should be able to change database storage structures without affecting the users' views.
• The internal structure of the database should be unaffected by changes to physical aspects of storage.
• The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
The above three levels are explained in detail below.
External Level
This is the highest level, the one that is closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
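The division of labour between these sublanguages can be sketched with Python's built-in sqlite3 module (the student table is an illustrative assumption). Note that SQLite has no user accounts, so the DCL part is shown only as a comment with the syntax a full DBMS would typically use:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define and declare a database object (a table)
cur.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object
cur.execute("INSERT INTO student VALUES (1, 'Asha')")
cur.execute("UPDATE student SET name = 'Asha K' WHERE student_id = 1")
rows = cur.execute("SELECT * FROM student").fetchall()
print(rows)  # [(1, 'Asha K')]

# DCL controls access rights; in a multi-user DBMS this would look like:
#   GRANT SELECT ON student TO some_user;
#   REVOKE UPDATE ON student FROM some_user;
```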
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level, the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus, the objectives of the three-level architecture proposal for a DBMS are explained
above.
(C) Describe the structure of DBMS
Ans: A DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update and
retrieve) on the database. The components of the DBMS perform these requested operations on the
database and provide the necessary data to the users.
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - The Data Definition Language compiler processes the schema definitions specified
in the DDL. It stores metadata information such as the names of the files and data items, storage
details of each file, mapping information, constraints, etc.
2 DML Compiler and Query Optimizer - The DML commands, such as insert, update, delete and
retrieve, from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized into the best way to execute the query by
the query optimizer, and then sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
It converts operations in users' queries, coming from the application programs or from the
combination of the DML compiler and query optimizer (together known as the Query Processor), from the user's logical view
to the physical file system.
It controls access to the DBMS information that is stored on disk.
It also controls the handling of buffers in main memory.
It also enforces constraints to maintain the consistency and integrity of the data.
It also synchronizes the simultaneous operations performed by concurrent users.
It also controls the backup and recovery operations.
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1 Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2 Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3 Constraints on data, i.e. the range of values permitted.
4 Detailed information on physical database design, such as storage structure,
access paths, and file and record sizes.
5 Access authorization - the description of database users, their responsibilities
and their access rights.
6 Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation
and accuracy, and may be used as an important part of the DBMS.
Importance of the Data Dictionary -
The data dictionary is necessary in databases due to the following reasons:
It improves the control of the DBA over the information system, and the users'
understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts the high-level queries into low-level file access
commands known as compiled DML.
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve Users: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. There are other such naive users wherever the type and range of response is always indicated to the user. Thus, even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online Users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator Centralized control of the database is exerted by a person or group of persons under the supervision of a high level administrator This person or group is referred to as the database administrator (DBA) They are users who are the most familiar with the database and are responsible for creating modifying and maintaining its three levels
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of physical storage from use of the
data by an application program, i.e. program/data independence. The user, programmer or
application specialist need not know the details of how the data are stored; such details are
transparent to the user. Changes can be made to data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic operations), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating from one device to
another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages:
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updating of the same data in different files
• Time wasted in entering data again and again
• Computer resources being needlessly used
• Difficulty in combining information
2 Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data. So we need to remove this duplication of
data in multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In a
conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, use of a DBMS
should allow users that don't know programming to interact with the data more easily, unlike a
file processing system, where the programmer may need to write new programs to meet every
new demand.
4 Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the
data in the database changes.
5 Integrity can be improved - Since data of an organization using the database approach is
centralized and used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to entry of incorrect data in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems.
7 Security can be improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization would access different
components of the operational data; in such an environment enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to which parts of the database. Different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8 The organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers the work of its unit as the most
important, and therefore considers its needs as the most important. Once a database has been
set up with centralized control, it will be necessary to identify the organization's requirements and
to balance the needs of the competing units. So it may become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher when using the non-procedural languages that
have been developed with DBMSs than when using procedural languages.
10 A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand. The overall view is often not considered. Building an overall view of the
organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes such as
recovery and backup from failures, including disk crashes, power failures and software errors,
which may help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain the ER model with a suitable example.
Ans: It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process, with all business managers (or designates)
involved, and should be validated with a "bottom-up" approach. It has three primary components: entities,
relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type. There are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes: student ID, student name,
address, etc.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One to many (1:M)
Many to one (M:1)
Many to many (M:N)
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
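One common way to realize the Customer entity above is to flatten its composite attributes into relational columns. The sketch below uses Python's built-in sqlite3 module; the exact column names and sample values are assumptions for illustration (a multivalued attribute, had there been one, would instead go into a separate table keyed by customer_id):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE customer (
    customer_id     INTEGER PRIMARY KEY,                     -- key attribute
    first_name      TEXT, middle_name TEXT, last_name TEXT,  -- composite 'name'
    phone_number    TEXT,
    date_of_birth   TEXT,
    city            TEXT, state TEXT, zip_code TEXT,         -- composite 'address'
    street_name     TEXT, street_number TEXT, apartment_number TEXT  -- 'street'
)
""")
conn.execute(
    "INSERT INTO customer VALUES (1, 'Asha', 'R', 'Kale', '555-0100', "
    "'1990-01-01', 'Nagpur', 'MH', '440001', 'Main', '12', '3')")
print(conn.execute("SELECT customer_id, first_name FROM customer").fetchall())
```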
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file and direct file we have considered the retrieval and
update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we may get a set of
records which satisfy the given value.
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE3- EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows:
1 If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot
have a nontrivial lossless decomposition into any number of smaller tables.
Another way of expressing this is that each join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one, you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor, to determine the vendor you must know the buyer and
the item, and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
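The three-way decomposition above can be checked in code: project the sample Buying relation onto its three attribute pairs and verify that their natural join reproduces exactly the original rows (the join dependency holds). This is a minimal sketch in plain Python sets:

```python
# The sample Buying relation from the text, as (buyer, vendor, item) tuples.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three pairwise projections: Buyer-Vendor, Buyer-Item, Vendor-Item.
bv = {(b, v) for b, v, i in buying}
bi = {(b, i) for b, v, i in buying}
vi = {(v, i) for b, v, i in buying}

# Natural join of the three projections on their common attributes.
rejoined = {(b, v, i)
            for b, v in bv
            for b2, i in bi if b2 == b
            for v2, i2 in vi if v2 == v and i2 == i}

print(rejoined == buying)  # True: the decomposition is lossless for this data
```

With the decomposition in place, recording "Claiborne starts to sell jeans" takes a single new (vendor, item) row, and the join then derives the implied buyer combinations automatically.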
(B) Explain the architecture of an IMS System
Ans: Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
[Fig.: IMS system architecture - each application (host language + DL/I) accesses the physical
databases (defined by DBDs) through the PCBs of its PSB, under the IMS control program]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined by the DBD. The set of DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping
definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1 DBD NAME=EDUCPDBD
2 SEGM NAME=COURSE,BYTES=256
3 FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4 FIELD NAME=TITLE,BYTES=33,START=4
5 FIELD NAME=DESCRIPN,BYTES=220,START=37
6 SEGM NAME=PREREQ,PARENT=COURSE,BYTES=36
7 FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8 FIELD NAME=TITLE,BYTES=33,START=4
9 SEGM NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB TYPE=DB,DBDNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End-users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: A possible key.
Each non-key field is functionally dependent on every candidate key.
No attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of the functional dependencies used in
normalization:
They have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of
the dependency, hold for all time, and are nontrivial.
The complete set of functional dependencies for a given relation can be very
large.
It is important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation,
and that has the property that every functional dependency in Y is implied by the
functional dependencies in X.
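The definition above ("the determinant determines the dependent attribute") can be tested mechanically against a concrete relation: an FD X → Y holds iff no two rows agree on X but disagree on Y. A minimal sketch, with an illustrative student relation (not from the source):

```python
def fd_holds(rows, lhs, rhs):
    """Return True iff every pair of rows agreeing on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)  # determinant values
        val = tuple(row[a] for a in rhs)  # dependent values
        if key in seen and seen[key] != val:
            return False  # same determinant, different dependent value
        seen[key] = val
    return True

students = [
    {"student_id": 1, "name": "Asha", "city": "Nagpur"},
    {"student_id": 2, "name": "Ravi", "city": "Pune"},
    {"student_id": 3, "name": "Asha", "city": "Nagpur"},
]

print(fd_holds(students, ["student_id"], ["name"]))  # True: id determines name
print(fd_holds(students, ["name"], ["student_id"]))  # False: two ids share 'Asha'
```

Note that this only checks one relation instance; a real FD is a constraint that must hold for all time, as stated above.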
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal-form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF.
We will pay particular attention up to 3NF.
The database designers need not normalize to the highest possible normal form.
It is a formal technique for analyzing a relation based on its primary key and the functional dependencies
between its attributes.
It is often executed as a series of steps. Each step corresponds to a specific normal form which has
known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format, and
also less vulnerable to update anomalies.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
BCNF: R is in BCNF iff every determinant is a candidate key
Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and its multivalued dependencies are functional dependencies. 4NF
removes unwanted data structures: multivalued dependencies.
Either:
there is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes are dependent between themselves.
One of these conditions must hold true in order for the relation to be in fourth normal form.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers
multivalued dependencies.
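A multivalued dependency can be sketched with a classic example (the course/teacher/book relation below is an illustrative assumption, not from the source): if course ↠ teacher and course ↠ book, every teacher of a course is paired with every book of that course, so the relation decomposes losslessly into two binary tables:

```python
# A relation violating 4NF: course ->> teacher and course ->> book,
# so all teacher/book combinations for a course must be stored.
teaching = {
    ("DBMS", "Dr. Rao", "Navathe"),
    ("DBMS", "Dr. Rao", "Date"),
    ("DBMS", "Dr. Sen", "Navathe"),
    ("DBMS", "Dr. Sen", "Date"),
}

# 4NF decomposition: one relation per independent multivalued fact.
course_teacher = {(c, t) for c, t, b in teaching}
course_book    = {(c, b) for c, t, b in teaching}

# The natural join on 'course' reconstructs the original without loss.
rejoined = {(c, t, b)
            for c, t in course_teacher
            for c2, b in course_book if c2 == c}
print(rejoined == teaching)  # True
```

After the decomposition, adding a new book for DBMS takes one row instead of one row per teacher, which is exactly the update anomaly 4NF removes.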
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the
set of all its versions. Also, object versions can be treated as objects in their own right. Some object
databases also provide systematic support for triggers and constraints, which are the basis of active
databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could get the user's account information and
efficiently provide extensive information such as transactions and account information entries.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to
explore the options available in order to prepare for the worst.
SQL Server database recovery can be more easily achieved if you are running at least SQL Server 2000.
It has a built-in feature known as the database recovery model, which controls the following:
The speed and size of your transaction log backups
The degree to which you might be at risk of losing committed transactions in the event of
media failure
Models
There are three types of database recovery models available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. The SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
The Log Marks feature allows you to place reference points in the transaction log that allow you to
recover to a log mark.
This model logs CREATE INDEX operations. Recovery from a transaction log backup that includes index
creations is done at a faster pace, because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(D) Describe deadlocks in a Distributed System.
Ans:
Theta Join
In a theta join we apply a condition θ to the input relation(s), and only the selected rows take part in the cross product that is merged into the output. In a normal cross product, all the rows of one relation are mapped/merged with all the rows of the second relation; here, only selected rows of a relation are cross-producted with the second relation. It is denoted as R ⋈θ S.
If R and S are two relations, then θ is the condition applied in the select operation on one relation, and only the selected rows are cross-producted with all the rows of the second relation. For example, given two relations FACULTY and COURSE, we first apply a select operation on the FACULTY relation to pick certain specific rows; those rows then form a cross product with the COURSE relation. This is the difference between a cross product and a theta join.
Looking at both relations, their attributes, and finally the cross product carried out after the select operation on a relation, the difference between a cross product and a theta join becomes clear.
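The filtered-cross-product idea above can be sketched in a few lines of Python. The FACULTY and COURSE data and column names below are hypothetical illustrations, not taken from the paper:

```python
# Hypothetical example relations, represented as lists of dicts (one dict per tuple).
faculty = [{"fac_id": 1, "fac_name": "Ali"}, {"fac_id": 2, "fac_name": "Sana"}]
course = [{"c_id": "CS1", "fac_id": 1}, {"c_id": "CS2", "fac_id": 2}]

def theta_join(r, s, theta):
    """Keep only those rows of the cross product r x s that satisfy predicate theta."""
    return [{**t1, **t2} for t1 in r for t2 in s if theta(t1, t2)]

# theta here is FACULTY.fac_id = COURSE.fac_id (an equi-join, a special case of theta join).
result = theta_join(faculty, course, lambda t1, t2: t1["fac_id"] == t2["fac_id"])
print(len(result))  # 2 matching rows, out of 4 in the full cross product
```

With a condition that is always true, the same function degenerates into the plain cross product, which makes the relationship between the two operations explicit.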
(i) Referential Integrity
Referential integrity refers to the accuracy and consistency of data within a relationship.
In relationships, data is linked between two or more tables. This is achieved by having the foreign key (in the associated table) reference a primary key value (in the primary, or parent, table). Because of this, we need to ensure that data on both sides of the relationship remain intact.
So, referential integrity requires that whenever a foreign key value is used, it must reference a valid, existing primary key in the parent table.
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no foreign key in any related table with the value of 15. We should only be able to delete a primary key if there are no associated records. Otherwise, we would end up with an orphaned record.
Here, the related table contains a foreign key value that doesn't exist in the primary key field of the primary table (i.e. the "CompanyId" field). This has resulted in an "orphaned record".
So referential integrity will prevent users from:
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
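These three rules can be demonstrated with SQLite from Python. The company/product schema below is a hypothetical stand-in for the "CompanyId" example; note that SQLite only enforces foreign keys when the pragma is switched on:

```python
import sqlite3

# Minimal sketch of referential integrity enforcement (schema is hypothetical).
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite ignores FKs without this
con.execute("CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT)")
con.execute("""CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    company_id INTEGER REFERENCES company(company_id))""")

con.execute("INSERT INTO company VALUES (15, 'Acme')")
con.execute("INSERT INTO product VALUES (1, 15)")  # OK: parent row 15 exists

try:
    # Deleting the parent would orphan product row 1, so the DBMS rejects it.
    con.execute("DELETE FROM company WHERE company_id = 15")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

Inserting a product that references a non-existent company is rejected in the same way, which is exactly the "adding records to a related table" rule above.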
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually with no indication of an error. This could result in records being "lost" in the database, because they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated company).
Or, worse yet, it could result in customers not receiving products they paid for.
Worse still, it could affect life-and-death situations, such as a hospital patient not receiving the correct treatment, or a disaster relief team not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity, which is concerned with the accuracy and consistency of all data (relationship or otherwise). Maintaining data integrity is a crucial part of working with databases.
(d) Explain the following domain in detail with example.
Ans: Definition: The domain of a database attribute is the set of all allowable values that the attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are the only permitted entries in that column.
In data management and database analysis, a data domain refers to all the unique values which a data element may contain. The rule for determining the domain boundary may be as simple as a data type with an enumerated list of values.[1]
For example, a database table that has information about people, with one record per person, might have a gender column. This gender column might be declared as a string data type and allowed to have one of two known code values: M for male, F for female, and NULL for records where gender is unknown or not applicable (or, arguably, U for unknown as a sentinel value). The data domain for the gender column is {M, F}.
In a normalized data model, the reference domain is typically specified in a reference table. Following the previous example, a Gender reference table would have exactly two records, one per allowed value (excluding NULL). Reference tables are formally related to other tables in a database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check constraint or, in more complex cases, in a database trigger. For example, a column requiring positive numeric values may have a check constraint declaring that the values must be greater than zero.
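Both kinds of domain rule can be written as CHECK constraints in SQLite. The person table below is a hypothetical illustration combining the gender domain above with a positive-salary rule:

```python
import sqlite3

# Sketch: enforcing simple domains with CHECK constraints (schema is hypothetical).
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE person (
    name   TEXT,
    gender TEXT    CHECK (gender IN ('M', 'F') OR gender IS NULL),
    salary NUMERIC CHECK (salary > 0))""")

con.execute("INSERT INTO person VALUES ('Mary', 'F', 50000)")  # within both domains

try:
    con.execute("INSERT INTO person VALUES ('Sam', 'X', 50000)")  # 'X' is outside {M, F}
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

A value outside the enumerated list, or a non-positive salary, is rejected by the engine itself rather than by application code.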
This definition combines the concept of domain as an area over which control is exercised with the mathematical idea of a set of values of an independent variable for which a function is defined.
(ii) Degree and cardinality
The degree of relationship (also known as cardinality) is the number of occurrences in one entity which are associated (or linked) to the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
Note that the conventional notation for the last one is M:N, not M:M.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-to-one relationship rarely exists in practice, but it can; if it does, you may consider combining the two entities into one.
For example, an employee is allocated a company car which can only be driven by that employee.
Therefore, there is a one-to-one relationship between employee and company car.
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example, taking the employee and department entities shown on the previous page, an employee works in one department, but a department has many employees.
Therefore, there is a one-to-many relationship between department and employee.
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would prevent any such relationships, but the definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely exist; normally they occur because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a team of many employees.
Therefore, there is a many-to-many relationship between employee and project.
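In a relational schema, an M:N relationship such as employee-project is normally resolved with an associative (junction) table holding a foreign key to each side. A minimal SQLite sketch, with hypothetical table and column names:

```python
import sqlite3

# Sketch: resolving the employee/project M:N relationship via a junction table.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE works_on (
    emp_id  INTEGER REFERENCES employee(emp_id),
    proj_id INTEGER REFERENCES project(proj_id),
    PRIMARY KEY (emp_id, proj_id));
INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Inventory');
-- Asha works on both projects; Ravi works on Payroll only.
INSERT INTO works_on VALUES (1, 10), (1, 20), (2, 10);
""")

rows = con.execute("""SELECT e.name, p.title
                      FROM employee e
                      JOIN works_on w ON e.emp_id = w.emp_id
                      JOIN project  p ON w.proj_id = p.proj_id
                      ORDER BY e.name, p.title""").fetchall()
print(rows)  # one row per employee-project pairing
```

Each row of works_on records one pairing, so both "an employee works on many projects" and "a project has many employees" are represented without repeating employee or project details.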
Q4
EITHER
(a) Explain DBTG Data Manipulation.
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on Data Systems Languages (CODASYL), the group responsible for standardization of the programming language COBOL. The DBTG final report appeared in April 1971; it introduced a new, distinct and self-contained language. The DBTG is intended to meet the requirements of many distinct programming languages, not just COBOL: the user in a DBTG system is considered to be an ordinary application programmer, and the language therefore is not biased toward any single specific programming language.
(b) It is based on the network model. In addition to proposing a formal notation for networks (the Data Definition Language or DDL), the DBTG has proposed a Subschema Data Definition Language (Subschema DDL) for defining views of the conceptual scheme that was itself defined using the Data Definition Language. It also proposed a Data Manipulation Language (DML) suitable for writing application programs that manipulate the conceptual scheme or a view.
(c) Architecture of DBTG Model
(d) The architecture of a DBTG system is illustrated in Figure.
(e) The architecture of the DBTG model can be divided into three different levels, as with the architecture of a database system. These are:
(f) • Storage Schema (corresponds to the Internal View of the database)
(g) • Schema (corresponds to the Conceptual View of the database)
(h) • Subschema (corresponds to the External View of the database)
(i) Storage Schema
(j) The storage structure (Internal View) of the database is described by the storage schema, written in a Data Storage Description Language (DSDL).
(k) Schema
(l) In DBTG the Conceptual View is defined by the schema. The schema consists essentially of definitions of the various types of record in the database, the data-items they contain, and the sets into which they are grouped. (Here, logical record types are referred to as record types; the fields in a logical record format are called data-items.)
(m) Subschema
(n) The External View (not a DBTG term) is defined by a subschema. A subschema consists essentially of a specification of which schema record types the user is interested in, which schema data-items he or she wishes to see in those records, and which schema relationships (sets) linking those records he or she wishes to consider. By default, all other types of record, data-item and set are excluded.
(o) In the DBTG model, the users are application programmers writing in an ordinary programming language, such as COBOL, that has been extended to include the DBTG data manipulation language. Each application program invokes the corresponding subschema; using the COBOL Data Base Facility, for example, the programmer simply specifies the name of the required subschema in the Data Division of the program. This invocation provides the definition of the user work area (UWA) for that program. The UWA contains a distinct location for each type of record (and hence for each data-item) defined in the subschema. The program may refer to these data-item and record locations by the names defined in the subschema.
Q5
EITHER
(a) Define normalization. Explain first and second normal form.
Ans: Normalization: The process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF. We will pay particular attention up to 3NF.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table, transform data from the information source (e.g. a form) into table format with columns and rows.
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value. If a table of data meets the definition of a relation, it is in first normal form:
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove each repeating group by:
entering appropriate data into the empty columns of rows containing repeating data ('flattening' the table)
or by
placing the repeating data, along with a copy of the original key attribute(s), into a separate relation
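The 'flattening' step can be sketched in Python. The customer/order data below is a hypothetical example of a repeating group:

```python
# Hypothetical unnormalized data: each row holds a customer plus a repeating
# group of order numbers (a non-atomic value, so this is not in 1NF).
unnormalized = [
    {"cust_id": 1, "name": "Mary", "orders": [101, 102]},
    {"cust_id": 2, "name": "Sam",  "orders": [103]},
]

# 1NF: one atomic value per row/column intersection - repeat the key attributes
# once for every member of the repeating group.
first_nf = [
    {"cust_id": row["cust_id"], "name": row["name"], "order_no": o}
    for row in unnormalized
    for o in row["orders"]
]
for r in first_nf:
    print(r)
```

The two unnormalized rows become three 1NF rows, one per order, each carrying a copy of the key.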
Second Normal Form (2NF)
Based on the concept of full functional dependency:
A and B are attributes of a relation.
B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A.
2NF: A relation that is in 1NF and in which every non-primary-key attribute is fully functionally dependent on the primary key.
Equivalently, 2NF means 1NF and no partial functional dependencies. A partial functional dependency exists when one or more non-key attributes are functionally dependent on part of the primary key. Every non-key attribute must be defined by the entire key, not just by part of the key. If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.
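A small Python sketch of that decomposition, using a hypothetical ORDER_LINE relation whose key is (order_no, product_no) and in which product_name depends only on product_no (a partial dependency):

```python
# Hypothetical 1NF relation with a partial dependency:
# key = (order_no, product_no), but product_name depends on product_no alone.
order_line = [
    {"order_no": 1, "product_no": "P1", "product_name": "Pen",    "qty": 3},
    {"order_no": 1, "product_no": "P2", "product_name": "Pencil", "qty": 1},
    {"order_no": 2, "product_no": "P1", "product_name": "Pen",    "qty": 5},
]

# New relation PRODUCT(product_no -> product_name): the partially dependent
# attribute moves out, together with a copy of its determinant.
product = {r["product_no"]: r["product_name"] for r in order_line}

# The remaining relation keeps only attributes fully dependent on the whole key.
order_line_2nf = [
    {"order_no": r["order_no"], "product_no": r["product_no"], "qty": r["qty"]}
    for r in order_line
]
print(product)
print(order_line_2nf)
```

Note that 'Pen' is now stored once in PRODUCT instead of once per order line, which is exactly the redundancy 2NF removes.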
Third Normal Form (3NF)
2NF and no transitive dependencies.
A transitive dependency is a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency: A, B and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).
3NF: A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with a suitable example.
As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
Ans
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and its multi-valued dependencies are functional dependencies. 4NF removes unwanted data structures: multi-valued dependencies.
Either there is no multivalued dependency in the relation, or there are multivalued dependencies but the attributes are dependent between themselves. One of these conditions must hold for the relation to be in fourth normal form.
The relation must also be in BCNF; fourth normal form differs from BCNF only in that it uses multivalued dependencies.
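An MVD X →→ Y holds when, for any two tuples agreeing on X, swapping their Y-values also yields tuples of the relation. A sketch of that check in Python, using the classic (and here hypothetical) course/teacher/book example where teachers and books are independent:

```python
from itertools import product

# Sketch: test whether the MVD X ->> Y holds in a relation (tuples as dicts).
def satisfies_mvd(rel, x, y):
    attrs = set(rel[0])
    z = attrs - set(x) - set(y)  # the remaining attributes

    def proj(t, a):
        return tuple(t[k] for k in sorted(a))

    for t1, t2 in product(rel, rel):
        if proj(t1, x) != proj(t2, x):
            continue
        # The MVD requires a tuple combining t1's Y-values with t2's Z-values.
        needed = {**{k: t1[k] for k in x},
                  **{k: t1[k] for k in y},
                  **{k: t2[k] for k in z}}
        if needed not in rel:
            return False
    return True

# Hypothetical relation where course ->> teacher (teachers and books independent).
rel = [
    {"course": "DB", "teacher": "Ali",  "book": "Elmasri"},
    {"course": "DB", "teacher": "Ali",  "book": "Date"},
    {"course": "DB", "teacher": "Sana", "book": "Elmasri"},
    {"course": "DB", "teacher": "Sana", "book": "Date"},
]
print(satisfies_mvd(rel, ["course"], ["teacher"]))       # True
print(satisfies_mvd(rel[:-1], ["course"], ["teacher"]))  # False: one combination missing
```

Dropping one teacher/book combination breaks the independence, so the MVD no longer holds; a relation like this one is decomposed in 4NF into (course, teacher) and (course, book).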
(d) What are inference axioms? Explain their significance in Relational Database Design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy certain other FDs:
F1 Reflexivity: X → X
F2 Augmentation: if Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: if X → Y and X → Z, then X → YZ
F4 Projectivity: if X → YZ, then X → Y
F5 Transitivity: if X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: if X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City - Given
2. Street Zip → Street City - Augmentation of (1) by Street
3. City Street → Zip - Given
4. City Street → City Street Zip - Augmentation of (3) by City Street
5. Street Zip → City Street Zip - Transitivity of (2) and (4)
[From Maier]
2. Let R = (ABCDEGHI), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E - Given
2. AB → AB - Reflexivity
3. AB → B - Projectivity from (2)
4. AB → BE - Additivity from (1) and (3)
5. BE → I - Given
6. AB → I - Transitivity from (4) and (5)
7. E → G - Given
8. AB → G - Transitivity from (1) and (7)
9. AB → GI - Additivity from (6) and (8)
10. GI → H - Given
11. AB → H - Transitivity from (9) and (10)
12. AB → GH - Additivity from (8) and (11)
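Derivations like the one above can be checked mechanically by computing the closure of an attribute set under F (the standard attribute-closure algorithm, not something specific to this paper):

```python
# Sketch: closure of an attribute set under a set of functional dependencies.
def closure(attrs, fds):
    """fds is a list of (lhs, rhs) pairs; attribute sets are written as strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already in the closure, add the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# F from the Maier example (AG -> J kept exactly as given in the text).
F = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]
print(sorted(closure("AB", F)))  # contains G and H, so AB -> GH holds
```

Since G and H both appear in the closure of AB, the twelve-step derivation is confirmed in one call.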
Significance in Relational Database design: A relational database is a database structure, commonly used in GIS, in which data is stored in two-dimensional tables and multiple relationships between data elements can be defined and established in an ad-hoc manner. A Relational Database Management System is a database system made up of files with data elements in a two-dimensional array (rows and columns). This database management system has the capability to recombine data elements to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables:
• Tables are manipulated a set at a time, rather than a record at a time
• SQL is used to manipulate relational databases; the model was proposed by Dr Codd in 1970
• It is the basis for the relational database management system (RDBMS)
• The relational model contains the following components:
• a collection of objects or relations
• a set of operations to act on the relations
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that is being locked by the other user. It can be dealt with in two ways: one is to set measures which prevent deadlocks from happening, and the other is to set ways in which to break a deadlock after it happens. One way to prevent or avoid deadlocks is to require the user to request all necessary locks at one time, ensuring they gain access to everything they need or to nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order, meaning resources must be locked in a certain order to prevent such instances. Once a deadlock does occur, the DBMS must have a method for detecting the deadlock; then, to resolve it, the DBMS must select a transaction to cancel and revert that entire transaction until the resources required become available, allowing one transaction to complete while the other has to be reprocessed at a later time.

Explain the meaning of the expression ACID transaction.
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it should be atomic: it should either be complete or fully incomplete; there should not be anything like semi-complete. The database state should remain consistent after the completion of the transaction. If there is more than one transaction, the transactions should be scheduled in such a fashion that they remain in isolation from one another. Durability means that once a transaction commits, its effects will persist even if there are system failures.

What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the process of being changed. Their purpose is to ensure consistency throughout the database. For example, if I am changing a row which affects the calculations or outputs of several other rows, then all rows that are affected, or possibly affected, by a change in the row I'm working on will be locked from changes until I am complete with my change. This isolates the change and ensures that the data interaction remains accurate and consistent, and is known as transaction-level consistency. The transaction being changed may affect several other pieces of data, or rows of input, and could also affect how those rows are read. So let's say I'm processing a change to the tax rate in my state; my store clerk shouldn't be able to read the total cost of a blue shirt, because the total cost row is affected by any changes in the tax rate row. Essentially, how you deal with the reading and viewing of data while a change is being processed but hasn't been committed is known as the transaction isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction being committed.
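The detection step mentioned in the deadlock answer above is usually implemented as a cycle check on the wait-for graph, where an edge T1 → T2 means transaction T1 is waiting for a lock held by T2. A minimal sketch:

```python
# Sketch: deadlock detection as cycle detection in a wait-for graph.
# wait_for maps each transaction to the transactions it is waiting on.
def has_deadlock(wait_for):
    visited, on_stack = set(), set()

    def dfs(t):
        visited.add(t)
        on_stack.add(t)
        for u in wait_for.get(t, []):
            # A back edge to a node on the current DFS stack closes a cycle.
            if u in on_stack or (u not in visited and dfs(u)):
                return True
        on_stack.discard(t)
        return False

    return any(t not in visited and dfs(t) for t in wait_for)

print(has_deadlock({"T1": ["T2"], "T2": []}))      # False: no cycle
print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))  # True: T1 and T2 wait on each other
```

When a cycle is found, the DBMS picks a victim transaction on the cycle, rolls it back, and lets the others proceed.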
(b) Explain concurrency control and database recovery in detail.
Ans: In a multiprogramming environment, where multiple transactions can be executed simultaneously, it is highly important to control the concurrency of transactions. We have concurrency control protocols to ensure atomicity, isolation and serializability of concurrent transactions. Concurrency control protocols can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two kinds:
Binary locks: A lock on a data item can be in two states; it is either locked or unlocked.
Shared/exclusive: This type of locking mechanism differentiates the locks based on their uses. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock: allowing more than one transaction to write on the same data item would lead the database into an inconsistent state. Read locks are shared, because no data value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a write operation is performed. Transactions may unlock the data item after completing the 'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they need locks. Before initiating an execution, the transaction requests the system for all the locks it needs beforehand. If all the locks are granted, the transaction executes and releases all the locks when all its operations are over. If all the locks are not granted, the transaction rolls back and waits until all the locks are granted.
Two-Phase Locking (2PL)
This locking protocol divides the execution of a transaction into three parts. In the first part, when the transaction starts executing, it seeks permission for the locks it requires. The second part is where the transaction acquires all the locks. As soon as the transaction releases its first lock, the third phase starts; in this phase, the transaction cannot demand any new locks, it only releases the acquired locks.
Two-phase locking has two phases: one is growing, where all the locks are being acquired by the transaction, and the second phase is shrinking, where the locks held by the transaction are being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then upgrade it to an exclusive lock.
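The growing/shrinking discipline for a single transaction can be sketched as follows (a simplified illustration; a real lock manager would also track lock modes and conflicts between transactions):

```python
# Sketch: the two-phase rule for one transaction - once the first lock is
# released (shrinking phase), no further lock may be acquired.
class TwoPhaseTxn:
    def __init__(self):
        self.locks = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after first unlock")
        self.locks.add(item)        # growing phase

    def unlock(self, item):
        self.shrinking = True       # first unlock starts the shrinking phase
        self.locks.discard(item)

t = TwoPhaseTxn()
t.lock("A")
t.lock("B")     # growing phase: acquiring locks is allowed
t.unlock("A")   # shrinking phase begins here
try:
    t.lock("C")  # forbidden under 2PL
except RuntimeError as e:
    print(e)
```

Strict 2PL, described next, simply defers every unlock to commit time, so the shrinking phase collapses into a single release at the end.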
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock after using it: Strict-2PL holds all the locks until the commit point and releases them all at once.
Strict-2PL does not have cascading aborts, as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among transactions at the time of execution, whereas timestamp-based protocols start working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age of the transaction. A transaction created at 0002 clock time would be older than all other transactions that come after it. For example, any transaction 'y' entering the system at 0004 is two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read- and write-timestamp. This lets the system know when the last 'read' and 'write' operation was performed on the data item.
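The basic timestamp-ordering rules built on those per-item read/write timestamps can be sketched like this (a simplified illustration of the standard rules, without the Thomas write rule or restart logic):

```python
# Sketch: basic timestamp-ordering rules. Each data item carries the largest
# timestamp that has read it (read_ts) and written it (write_ts).
def try_read(txn_ts, item):
    if txn_ts < item["write_ts"]:
        return "rollback"          # a younger transaction already overwrote the item
    item["read_ts"] = max(item["read_ts"], txn_ts)
    return "ok"

def try_write(txn_ts, item):
    if txn_ts < item["read_ts"] or txn_ts < item["write_ts"]:
        return "rollback"          # a younger transaction already read or wrote it
    item["write_ts"] = txn_ts
    return "ok"

x = {"read_ts": 0, "write_ts": 0}
print(try_write(5, x))  # ok: transaction with timestamp 5 writes x
print(try_read(3, x))   # rollback: an older transaction may not read the newer value
```

A rolled-back transaction is restarted with a fresh (younger) timestamp, so the schedule always respects the age ordering.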
OR
(b) Explain database security mechanisms. (8)
Database security covers and enforces security on all aspects and components of databases. This includes:
Data stored in the database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and implementing a road map/plan to mitigate them
(d) Explain knowledge-based database systems in detail.
Ans:
The term knowledge-base was coined to distinguish this form of knowledge store from the more common and widely used term database. At the time (the 1970s), virtually all large Management Information Systems stored their data in some type of hierarchical or relational database. At this point in the history of Information Technology, the distinction between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
Flat data: Data was usually represented in a tabular format with strings or numbers in each field.
Multiple users: A conventional database needed to support more than one user or system logged into the same data at the same time.
Transactions: An essential requirement for a database was to maintain integrity and consistency among data accessed by concurrent users. These are the so-called ACID properties: Atomicity, Consistency, Isolation and Durability.
Large, long-lived data: A corporate database needed to support not just thousands but hundreds of thousands or more rows of data. Such a database usually needed to persist past the specific uses of any individual program; it needed to store data for years and decades rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database requirements. An expert system requires structured data: not just tables with numbers and strings, but pointers to other objects that in turn have additional pointers. The ideal representation for a knowledge base is an object model (often called an ontology in artificial intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with requiring transactional properties on data. The data for the early expert systems was used to arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response to an emergency.[1] Once the solution to the problem was known, there was not a critical demand to store large amounts of data back to a permanent memory store. A more precise statement would be that, given the technologies available, researchers compromised and did without these capabilities because they realized they were beyond what could be expected, and they could develop useful solutions to non-trivial problems without them. Even from the beginning, the more astute researchers realized the potential benefits of being able to store, analyze and reuse knowledge. For example, see the discussion of Corporate Memory in the earliest work of the Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge-base compared to a conventional database. The knowledge-base needed to know facts about the world, for example to represent the statement that "All humans are mortal". A database typically could not represent this general knowledge, but instead would need to store information in thousands of tables that represented information about specific humans. Representing that all humans are mortal, and being able to reason about any given human that they are mortal, is the work of a knowledge-base. Representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments, the requirements for their data storage rapidly started to overlap with the standard database requirements for multiple, distributed users with support for transactions. Initially, the demand could be seen in two different but competitive markets. From the AI and Object-Oriented communities, object-oriented databases such as Versant emerged. These were systems designed from the ground up to have support for object-oriented capabilities but also to support standard database services as well. On the other hand, the large database vendors such as Oracle added capabilities to their products that provided support for knowledge-base requirements such as class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet. With the rise of the Internet, documents, hypertext and multimedia support were now critical for any corporate database. It was no longer enough to support large tables of data or relatively small objects that lived primarily in computer memory. Support for corporate web sites required persistence and transactions for documents. This created a whole new discipline known as Web Content Management. The other driver for document support was the rise of knowledge management vendors, such as Lotus Notes. Knowledge management actually predated the Internet, but with the Internet there was great synergy between the two areas. Knowledge management products adopted the term knowledge-base to describe their repositories, but the meaning had a subtle difference. In the case of previous knowledge-based systems, the knowledge was primarily for the use of an automated system to reason about and draw conclusions about the world. With knowledge management products, the knowledge was primarily meant for humans, for example to serve as a repository of manuals, procedures, policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a system that could really be cleanly classified as knowledge-based in the sense of an expert system that performed automated reasoning, or knowledge-based in the sense of knowledge management that provided knowledge in the form of documents and media that could be leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another. Concurrent access is quite easy if all users are just reading data; there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur with a multi-user system. It helps you to make sure that database transactions are performed concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (/ˌætəˈmɪsəti/; from Ancient Greek ἄτομος, átomos, "undividable") is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur, or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client: at one moment in time it has not yet happened, and at the next it has already occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B. Performing these operations in an atomic transaction ensures that the database remains in a consistent state, that is, money is neither lost nor created if either of those two operations fails.
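The A-to-B transfer can be demonstrated with SQLite's transaction support (account names and amounts are hypothetical; the RuntimeError stands in for a failure between the two operations):

```python
import sqlite3

# Sketch: the transfer either commits as a whole or rolls back entirely.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
con.commit()

def transfer(con, amount, fail=False):
    try:
        con.execute("UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,))
        if fail:
            raise RuntimeError("simulated failure before crediting B")
        con.execute("UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,))
        con.commit()
    except RuntimeError:
        con.rollback()  # atomicity: the partial withdrawal is undone

transfer(con, 60, fail=True)   # fails mid-way: nothing changes
print(dict(con.execute("SELECT name, balance FROM account")))  # {'A': 100, 'B': 0}

transfer(con, 60)              # succeeds: both operations take effect together
print(dict(con.execute("SELECT name, balance FROM account")))  # {'A': 40, 'B': 60}
```

After the simulated failure the withdrawal is rolled back, so money is neither lost nor created, exactly as the definition requires.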
(B) Give the level architecture proposal for DBMS
Ans: Objectives of the three level architecture proposal for a DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change the database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to the physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
These three levels are explained in detail below:
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database. This view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part that
is relevant to that user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
being used; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three level architecture proposal for a DBMS are explained above.
(C) Describe the structure of DBMS
Ans: A DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update, and
retrieval) on the database. The components of the DBMS perform these requested operations on the
database and provide the necessary data to the users.
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified
in the DDL. It records metadata information such as the names of the files, the data items, the storage
details of each file, mapping information, and constraints.
2. DML Compiler and Query Optimizer - The DML commands, such as insert, update, delete, and
retrieve, from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized by the query optimizer to find the best way
to execute the query, and then sent to the data manager.
3. Data Manager - The data manager is the central software component of the DBMS, also known
as the database control system.
The main functions of the data manager are:
It converts operations in users' queries, coming from the application programs or from the combination of
the DML compiler and query optimizer (together known as the query processor), from the user's logical view
to the physical file system.
It controls access to the DBMS information that is stored on disk.
It handles buffers in main memory.
It enforces constraints to maintain the consistency and integrity of the data.
It synchronizes the simultaneous operations performed by concurrent users.
It controls the backup and recovery operations.
4. Data Dictionary - The data dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data: names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
5. Access authorization: a description of the database users, their responsibilities, and their access rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to control data integrity, database operation, and accuracy, and may be used as an important part of the DBMS.
Importance of Data Dictionary - A data dictionary is necessary in databases for the following reasons:
• It improves the control of the DBA over the information system and the users' understanding of the use of the system.
• It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve users: Naïve users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database — in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other naïve users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naïve users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing the application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general-purpose programming language, such as Assembler, C, COBOL, FORTRAN, Pascal, or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying, and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and the access methods to be used for optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of physical storage from the use of the
data by an application program, i.e. program/data independence: the user, programmer, or
application specialist need not know the details of how the data are stored; such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one
device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files.
• Wastage of storage space, since duplicated data is stored.
• Errors generated due to duplication of the same data in different files.
• Time wasted in entering the same data again and again.
• Computer resources being needlessly used.
• Difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data. We therefore need to remove this duplication of
data across multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In
a conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
currency is likely to improve, since the data can now be shared, and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, use of a DBMS
should allow users who do not know programming to interact with the data more easily, unlike a
file processing system, where the programmer may need to write new programs to meet every
new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the data in the
database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization would access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database, and different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8. The organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work the most
important, and therefore its needs the most important. Once a database has been
set up with centralized control, it becomes necessary to identify the organization's requirements and
to balance the needs of the competing units. It may become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (database administrator) to structure the database system
to provide the overall service that is best for the organization.
9. The overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher when using the non-procedural languages that
have been developed with DBMSs than when using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand, and the overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11. Backup and recovery are provided - Centralizing a database provides schemes such as
recovery and backup from failures, including disk crashes, power failures, and software errors,
which may help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods involved can be very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The ER model is a "top-down" approach. This data model allows us to describe how data is used in a
real-world enterprise. Modeling is an iterative, team-oriented process in which all business managers (or their
designates) are involved; the result should be validated with a "bottom-up" approach. The model has three
primary components: entities, relationships, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships, and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object, such as a house or a car; an event, such as a house sale or a car service; or a concept, such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One-to-many: 1 ------ M
Many-to-one: M ------ 1
Many-to-many: M ------ M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
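As an added illustration (not part of the original answer), the Customer entity above can be mapped to a relational table, with the composite attributes name, address, and street flattened into simple columns — one common mapping convention, not the only one. SQLite via Python's sqlite3 is used here only as a convenient engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite attributes are flattened into simple columns.
conn.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,
        first_name       TEXT NOT NULL,
        middle_name      TEXT,
        last_name        TEXT NOT NULL,
        phone_number     TEXT,
        date_of_birth    TEXT,
        city             TEXT,
        state            TEXT,
        zip_code         TEXT,
        street_name      TEXT,
        street_number    TEXT,
        apartment_number TEXT
    )
""")
conn.execute(
    "INSERT INTO customer (customer_id, first_name, last_name) VALUES (?, ?, ?)",
    (1, "Asha", "Rao"),  # hypothetical sample data
)
row = conn.execute("SELECT first_name, last_name FROM customer").fetchone()
print(row)  # ('Asha', 'Rao')
```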
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file, and direct file organizations, we have considered the retrieval and
update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
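A minimal sketch of secondary key retrieval, using SQLite from Python (the student data is hypothetical). The index on stud_name acts as the secondary key; note that one key value matches several records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stud_id INTEGER PRIMARY KEY, stud_name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Ravi"), (2, "Meena"), (3, "Ravi")])

# A secondary index on the non-unique attribute stud_name:
conn.execute("CREATE INDEX idx_stud_name ON student (stud_name)")

# Retrieval by secondary key may return several records:
rows = conn.execute("SELECT stud_id FROM student WHERE stud_name = ?",
                    ("Ravi",)).fetchall()
print(sorted(rows))  # [(1,), (3,)]
```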
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation

QUE3- EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot
have a lossless decomposition into any number of smaller tables.
Another way of expressing this is that every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one value, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item, and Vendor-Item.
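The lossless three-way decomposition can be checked directly on the sample data above (an added sketch in Python; the set comprehensions play the role of projection and natural join).

```python
# The Buying relation from the example above.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# Project onto the three attribute pairs (Buyer-Vendor, Buyer-Item, Vendor-Item).
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural join of the three projections.
rejoined = {
    (b, v, i)
    for (b, v) in buyer_vendor
    for (v2, i) in vendor_item if v2 == v
    if (b, i) in buyer_item
}

# The join dependency holds: no tuples are lost and none are invented.
print(rejoined == buying)  # True
```

Joining only two of the projections would produce the spurious tuple (Mary, Jordach, Sneakers); the third projection filters it out, which is why all three tables are needed.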
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
[Fig.: IMS system architecture - application programs A and B, each written in a host language plus DL/I, access the IMS control program through the PCBs of their respective PSBs (PSB-A, PSB-B); the control program maps these onto the physical databases defined by the DBDs.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined by the DBD. The set of DBDs corresponds to the conceptual schema plus the associated
conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1 DBD   NAME=EDUCPDBD
2 SEGM  NAME=COURSE,BYTES=256
3 FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4 FIELD NAME=TITLE,BYTES=33,START=4
5 FIELD NAME=DESCRIPN,BYTES=220,START=37
6 SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7 FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8 FIELD NAME=TITLE,BYTES=33,START=4
9 SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the logical
database (LDB) and the corresponding physical database (PDB).
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key,
and no attribute in the key can be deleted without destroying the property of unique identification.
The main characteristics of functional dependencies used in normalization are:
They have a 1:1 relationship between the attribute(s) on the left-hand and right-hand sides of the dependency.
They hold for all time.
They are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is
important to find an approach that can reduce the set to a manageable size. We need to identify a set
of functional dependencies (X) for a relation that is smaller than the complete set of functional
dependencies (Y) for that relation, and that has the property that every functional dependency in Y is
implied by the functional dependencies in X.
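As an added illustration, a functional dependency X → Y can be tested mechanically on a relation's extension: the dependency fails exactly when two tuples agree on X but differ on Y. The helper function and the sample rows below are hypothetical.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in `rows`
    (a list of dicts): it holds iff no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False
        seen[key] = val
    return True

students = [
    {"stud_id": 1, "name": "Ravi",  "dept": "CS"},
    {"stud_id": 2, "name": "Meena", "dept": "CS"},
    {"stud_id": 3, "name": "Ravi",  "dept": "EE"},
]
print(fd_holds(students, ["stud_id"], ["name"]))  # True: stud_id determines name
print(fd_holds(students, ["name"], ["dept"]))     # False: two Ravis, different depts
```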
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF, or 4NF;
we will pay particular attention up to 3NF.
The database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional
dependencies between its attributes. It is often executed as a series of steps, where each step corresponds
to a specific normal form with known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multi-valued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and its multi-valued dependencies are functional dependencies. 4NF
removes unwanted data structures: multi-valued dependencies.
One of the following conditions must hold in order for a relation to be in fourth normal form:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it deals with
multivalued dependencies.
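A multivalued dependency X →→ Y can likewise be checked on a relation's extension (an added sketch; the course/teacher/book relation is the classic textbook example, with hypothetical values): whenever two tuples agree on X, the tuple that mixes the first tuple's Y value with the second tuple's remaining attributes must also be present.

```python
def mvd_holds(rows, x, y, z):
    """Check X ->> Y in rows (list of dicts), where z holds the remaining
    attributes: for any two tuples agreeing on X, the tuple combining one
    tuple's Y with the other's Z must also appear in the relation."""
    s = {(tuple(r[a] for a in x),
          tuple(r[a] for a in y),
          tuple(r[a] for a in z)) for r in rows}
    for (xa, ya, za) in s:
        for (xb, yb, zb) in s:
            if xa == xb and (xa, ya, zb) not in s:
                return False
    return True

# course ->> teacher: teachers and books of a course vary independently.
ctx = [
    {"course": "DB", "teacher": "Smith", "book": "Date"},
    {"course": "DB", "teacher": "Smith", "book": "Ullman"},
    {"course": "DB", "teacher": "Jones", "book": "Date"},
    {"course": "DB", "teacher": "Jones", "book": "Ullman"},
]
print(mvd_holds(ctx, ["course"], ["teacher"], ["book"]))       # True
# Remove one teacher/book combination and the MVD no longer holds:
print(mvd_holds(ctx[:-1], ["course"], ["teacher"], ["book"]))  # False
```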
Q5
Either
(A) What are object oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined. A
general characteristic, however, is that the programming language and the database schema use the same
type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the
set of all its versions, and object versions can be treated as objects in their own right. Some object
databases also provide systematic support for triggers and constraints, which are the basis of active
databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could get a user's account information and
efficiently provide extensive information such as transactions and account entries.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time, even to the best configured systems. This is why you have to
explore the options available in order to prepare for the worst.
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000.
It has a built-in feature known as the database recovery model that controls the following:
The speed and size of your transaction log backups.
The degree to which you might be at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery models available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. The SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
The Log Marks feature allows you to place reference points in the transaction log that let you
recover to a log mark.
CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index
creations is faster, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
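The recovery models above are specific to SQL Server; as a loose, hedged analogy of a full backup followed by a restore, Python's sqlite3 module exposes a backup API (Connection.backup, available since Python 3.7). The schema and data are hypothetical.

```python
import sqlite3

src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
src.execute("INSERT INTO orders VALUES (1, 'widget')")
src.commit()

# Take a full backup into a second database (here also in memory).
backup = sqlite3.connect(":memory:")
src.backup(backup)

# Simulate losing the original, then read from the backup.
src.close()
row = backup.execute("SELECT item FROM orders WHERE id = 1").fetchone()
print(row[0])  # widget
```

A full backup like this protects committed data up to the moment of the backup; the transaction-log models described above additionally allow point-in-time recovery between backups.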
(D) Describe deadlocks in a distributed system.
Ans
Example
For example, if we delete record number 15 in a primary table, we need to be sure that there's no
foreign key in any related table with the value of 15. We should only be able to delete a primary
key if there are no associated records; otherwise we would end up with an orphaned record.
Here, the related table contains a foreign key value that doesn't exist in the primary key field of
the primary table (i.e. the "CompanyId" field). This has resulted in an "orphaned record".
So referential integrity will prevent users from
Adding records to a related table if there is no associated record in the primary table
Changing values in a primary table that result in orphaned records in a related table
Deleting records from a primary table if there are matching related records
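As a sketch of how a DBMS enforces these rules, the scenario above can be reproduced with Python's built-in sqlite3 module (the Company/Product table names are illustrative, chosen to match the "CompanyId" field mentioned above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.execute("CREATE TABLE Company (CompanyId INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("""CREATE TABLE Product (
    ProductId INTEGER PRIMARY KEY,
    CompanyId INTEGER NOT NULL REFERENCES Company(CompanyId))""")

conn.execute("INSERT INTO Company VALUES (15, 'Acme')")
conn.execute("INSERT INTO Product VALUES (1, 15)")

# Deleting the parent row would orphan Product 1, so the DBMS refuses.
try:
    conn.execute("DELETE FROM Company WHERE CompanyId = 15")
except sqlite3.IntegrityError as e:
    print("delete blocked:", e)

# Inserting a child with no matching parent is refused for the same reason.
try:
    conn.execute("INSERT INTO Product VALUES (2, 99)")
except sqlite3.IntegrityError as e:
    print("insert blocked:", e)
```

Both statements fail with a foreign-key constraint error, so no orphaned record can ever be created.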
Consequences of a Lack of Referential Integrity
A lack of referential integrity in a database can lead to incomplete data being returned, usually with no indication of an error. This could result in records being "lost" in the database, because they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated company).
Or worse yet, it could result in customers not receiving products they paid for.
Worse still, it could affect life-and-death situations, such as a hospital patient not receiving the correct treatment, or a disaster relief team not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity, which is concerned with the accuracy and consistency of all data (relationship or otherwise). Maintaining data integrity is a crucial part of working with databases.
(d) Explain the following in detail with examples:
(i) Domain
Ans: Definition: The domain of a database attribute is the set of all allowable values that the attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are the only permitted entries in that column.
In data management and database analysis, a data domain refers to all the unique values which a data element may contain. The rule for determining the domain boundary may be as simple as a data type with an enumerated list of values.[1]
For example, a database table that has information about people, with one record per person, might have a gender column. This gender column might be declared as a string data type, and allowed to have one of two known code values: M for male, F for female, and NULL for records where gender is unknown or not applicable (or arguably U for unknown as a sentinel value). The data domain for the gender column is {M, F}.
In a normalized data model, the reference domain is typically specified in a reference table. Following the previous example, a Gender reference table would have exactly two records, one per allowed value, excluding NULL. Reference tables are formally related to other tables in a database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check constraint or, in more complex cases, in a database trigger. For example, a column requiring positive numeric values may have a check constraint declaring that the values must be greater than zero.
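A minimal sketch of such database-enforced domain rules, using Python's sqlite3 module (the Person table and its columns are hypothetical, combining the gender and positive-value examples above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each CHECK constraint below is a database-enforced domain rule.
conn.execute("""CREATE TABLE Person (
    Name   TEXT,
    Gender TEXT CHECK (Gender IN ('M', 'F')),   -- enumerated domain {M, F}
    Salary NUMERIC CHECK (Salary > 0)           -- values must be positive
)""")

conn.execute("INSERT INTO Person VALUES ('Ann', 'F', 50000)")   # inside the domain
conn.execute("INSERT INTO Person VALUES ('Dee', NULL, 30000)")  # NULL = unknown, allowed

for bad_row in [('Bob', 'X', 40000),    # 'X' is outside the Gender domain
                ('Cal', 'M', -10)]:     # a negative salary violates the check
    try:
        conn.execute("INSERT INTO Person VALUES (?, ?, ?)", bad_row)
    except sqlite3.IntegrityError as e:
        print("rejected:", bad_row, "-", e)
```

Note that, following SQL semantics, a NULL value passes a CHECK constraint, which is what allows "unknown" gender records through.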
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
(ii) Degree and cardinality
The degree of a relationship (also known as cardinality) is the number of occurrences in one entity which are associated (or linked) with the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
The last is written M:N rather than M:M, because the number of occurrences on each side may differ.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-to-one relationship rarely exists in practice, but it can; in that case you may consider combining the two entities into one.
For example, an employee is allocated a company car which can only be driven by that employee.
Therefore there is a one-to-one relationship between employee and company car.
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example, taking the employee and department entities shown on the previous page, an employee works in one department, but a department has many employees.
Therefore there is a one-to-many relationship between department and employee.
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would prevent any such relationships, but the definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely exist; normally they occur because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a team of many employees.
Therefore there is a many-to-many relationship between employee and project.
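The three degrees can be sketched as table definitions (using Python's sqlite3; the schema below is one illustrative reading of the employee/department/project examples, with a junction table resolving the M:N relationship):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 1:1  an employee is allocated exactly one company car (UNIQUE foreign key)
CREATE TABLE Employee   (EmpId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE CompanyCar (CarId INTEGER PRIMARY KEY,
                         EmpId INTEGER UNIQUE REFERENCES Employee(EmpId));

-- 1:M  a department has many employees: FK on the "many" side, not unique
CREATE TABLE Department (DeptId INTEGER PRIMARY KEY, Name TEXT);
ALTER TABLE Employee ADD COLUMN DeptId INTEGER REFERENCES Department(DeptId);

-- M:N  employees work on projects: resolved with a junction (link) table
CREATE TABLE Project (ProjId INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE WorksOn (EmpId  INTEGER REFERENCES Employee(EmpId),
                      ProjId INTEGER REFERENCES Project(ProjId),
                      PRIMARY KEY (EmpId, ProjId));
""")
print("tables created")
```

The junction table turns one M:N relationship into two 1:M relationships, which is how relational schemas normally implement it.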
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on Data Systems Languages (CODASYL), the group responsible for standardization of the programming language COBOL. The DBTG final report appeared in April 1971; it introduced a new, distinct and self-contained language. The DBTG is intended to meet the requirements of many distinct programming languages, not just COBOL: the user in a DBTG system is considered to be an ordinary application programmer, and the language therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the Data Definition Language, or DDL), the DBTG proposed a Subschema Data Definition Language (Subschema DDL) for defining views of the conceptual scheme that was itself defined using the Data Definition Language. It also proposed a Data Manipulation Language (DML) suitable for writing application programs that manipulate the conceptual scheme or a view.
Architecture of DBTG Model
The architecture of a DBTG system is illustrated in Figure.
The architecture of the DBTG model can be divided into three different levels, like the architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema, written in a Data Storage Description Language (DSDL).
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists essentially of definitions of the various types of record in the database, the data items they contain, and the sets into which they are grouped. (Here, logical record types are referred to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists essentially of a specification of which schema record types the user is interested in, which schema data items he or she wishes to see in those records, and which schema relationships (sets) linking those records he or she wishes to consider. By default, all other types of record, data item and set are excluded.
In the DBTG model, the users are application programmers writing in an ordinary programming language such as COBOL that has been extended to include the DBTG data manipulation language. Each application program invokes the corresponding subschema; using the COBOL Data Base Facility, for example, the programmer simply specifies the name of the required subschema in the Data Division of the program. This invocation provides the definition of the user work area (UWA) for that program. The UWA contains a distinct location for each type of record (and hence for each data item) defined in the subschema. The program may refer to these data-item and record locations by the names defined in the subschema.
Q5
EITHER
(a) Define normalization. Explain first and second normal form.
Ans: Normalization: the process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF. We will pay particular attention up to 3NF.
NF²: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups
To create an unnormalized table
transform data from information source (eg form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which intersection of each row and column contains one and only one value
If a table of data meets the definition of a relation it is in first normal form
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove the repeating group by:
entering appropriate data into the empty columns of rows containing repeating data ('flattening' the table)
or by
placing the repeating data, along with a copy of the original key attribute(s), into a separate relation.
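The 'flattening' route can be sketched in a few lines of Python (the customer/order data is invented purely for illustration):

```python
# One unnormalized row per customer, each holding a repeating group of orders.
unf = [
    {"CustId": 1, "Name": "Ann", "Orders": [("O1", 250), ("O2", 90)]},
    {"CustId": 2, "Name": "Bob", "Orders": [("O3", 40)]},
]

# 'Flatten' the table: repeat the key columns for each member of the group,
# so every row/column intersection holds exactly one atomic value (1NF).
first_nf = [
    {"CustId": c["CustId"], "Name": c["Name"], "OrderNo": no, "Amount": amt}
    for c in unf
    for (no, amt) in c["Orders"]
]

for row in first_nf:
    print(row)
```

The resulting three rows have no repeating groups; the key of the flattened table becomes (CustId, OrderNo).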
Second Normal Form (2NF)
Based on concept of full functional dependency
A and B are attributes of a relation
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A
2NF - A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on the primary key
Second Normal Form (2NF)
1NF and no partial functional dependencies
Partial functional dependency when one or more non-key attributes are functionally
dependent on part of the primary key
Every non-key attribute must be defined by the entire key not just by part of the key
If a relation has a single attribute as its key then it is automatically in 2NF
1NF to 2NF
Identify primary key for the 1NF relation
Identify functional dependencies in the relation
If partial dependencies exist on the primary key remove them by placing them in a new
relation along with copy of their determinant
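These steps can be sketched in Python on a hypothetical OrderLine relation with composite key (OrderNo, ProductNo), where ProductName depends only on ProductNo, i.e. a partial dependency:

```python
# 1NF relation with composite key (OrderNo, ProductNo).  ProductName depends
# only on ProductNo -- a partial dependency, so the relation is not in 2NF.
order_lines = [
    ("O1", "P1", "Bolt", 10),
    ("O1", "P2", "Nut", 20),
    ("O2", "P1", "Bolt", 5),
]

# Remove the partial dependency: place ProductName in a new relation together
# with a copy of its determinant, ProductNo.
product = {(p_no, p_name) for (_, p_no, p_name, _) in order_lines}
order_line = [(o_no, p_no, qty) for (o_no, p_no, _, qty) in order_lines]

print(sorted(product))   # the new Product relation, one row per product
print(order_line)        # the remaining OrderLine relation, now in 2NF
```

After the split, every non-key attribute in each relation depends on the whole of that relation's key.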
Third Normal Form (3NF)
2NF and no transitive dependencies
Transitive dependency: a functional dependency between two or more non-key attributes
Based on the concept of transitive dependency:
A, B and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).
3NF: a relation that is in 1NF and 2NF, and in which no non-primary-key attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with a suitable example.
As normalization proceeds, relations become progressively more restricted (stronger) in format, and also less vulnerable to update anomalies.
Ans:
1. NF²: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and all its multi-valued dependencies are functional dependencies. 4NF removes unwanted data structures: multi-valued dependencies.
For a relation to be in fourth normal form, either there is no multivalued dependency in the relation, or there are multivalued dependencies but the attributes are dependent between themselves. One of these conditions must hold, and the relation must also be in BCNF. Fourth normal form differs from BCNF only in that it also considers multivalued dependencies.
(d) What are inference axioms? Explain their significance in relational database design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy certain other FDs:
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – given
2. Street Zip → Street City – augmentation of (1) by Street
3. City Street → Zip – given
4. City Street → City Street Zip – augmentation of (3) by City Street
5. Street Zip → City Street Zip – transitivity of (2) and (4)
[From Maier]
2. Let R = (A, B, C, D, E, G, H, I, J), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E – given
2. AB → AB – reflexivity
3. AB → B – projectivity from (2)
4. AB → BE – additivity from (1) and (3)
5. BE → I – given
6. AB → I – transitivity from (4) and (5)
7. E → G – given
8. AB → G – transitivity from (1) and (7)
9. AB → GI – additivity from (6) and (8)
10. GI → H – given
11. AB → H – transitivity from (9) and (10)
12. AB → GH – additivity from (8) and (11)
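Derivations like the one above can be checked mechanically by computing the closure of the left-hand side under F. A small Python sketch (the fixpoint loop applies the axioms implicitly; attributes are single characters here for brevity):

```python
def closure(attrs, fds):
    """Compute the closure of a set of attributes under a list of FDs:
    repeatedly add the right side of any FD whose left side is covered."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# F from the Maier example; each (lhs, rhs) pair is one FD.
F = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]

print(sorted(closure("AB", F)))       # ['A', 'B', 'E', 'G', 'H', 'I', 'J']
print(set("GH") <= closure("AB", F))  # True: F implies AB -> GH
```

Since G and H both appear in the closure of AB, the derivation AB → GH is confirmed (note that J also enters the closure, via AG → J, once G has been derived).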
Significance in relational database design: A relational database is a database structure, commonly used in GIS, in which data is stored in two-dimensional tables and multiple relationships between data elements can be defined and established in an ad-hoc manner. A Relational Database Management System is a database system made up of files with data elements in a two-dimensional array (rows and columns). This database management system has the capability to recombine data elements to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables:
• Tables are manipulated a set at a time, rather than a record at a time.
• SQL is used to manipulate relational databases. The model was proposed by Dr. Codd in 1970.
• It is the basis for the relational database management system (RDBMS).
• The relational model contains the following components: a collection of objects or relations, and a set of operations to act on the relations.
Q5
EITHER
(a) What is a deadlock? How can it be avoided? How can it be resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that is being locked by the other user. It can be dealt with in two ways: one is to set measures which prevent deadlocks from happening, and the other is to set ways in which to break the deadlock after it happens. One way to prevent or avoid deadlocks is to require the user to request all necessary locks at one time, ensuring they gain access to everything they need or nothing. Secondly, they can sometimes be avoided by setting a resource access order, meaning resources must be locked in a certain order to prevent such instances. Once a deadlock does occur, the DBMS must have a method for detecting it; to resolve it, the DBMS must select a transaction to cancel and revert the entire transaction until the resources required become available, allowing one transaction to complete while the other has to be reprocessed at a later time.
Explain the meaning of the expression "ACID transaction".
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it should be atomic: it should either be complete or fully incomplete; there should not be anything like semi-complete. The database state should remain consistent after the completion of the transaction. If there is more than one transaction, the transactions should be scheduled in such a fashion that they remain in isolation from one another. Durability means that once a transaction commits, its effects will persist even if there are system failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the process of being changed. Their purpose is to ensure consistency throughout the database. For example, if I am changing a row which affects the calculations or outputs of several other rows, then all rows that are affected, or possibly affected, by a change in the row I'm working on will be locked from changes until I am complete with my change. This isolates the change and ensures that the data interaction remains accurate and consistent; this is known as transaction-level consistency. The transaction being changed, which may affect several other pieces of data or rows of input, could also affect how those rows are read. So let's say I'm processing a change to the tax rate in my state; my store clerk shouldn't be able to read the total cost of a blue shirt, because the total cost row is affected by any changes in the tax rate row. Essentially, how you deal with the reading and viewing of data while a change is being processed but hasn't been committed is known as the transaction isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction being committed.
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment where multiple transactions can be executed simultaneously, it is highly important to control the concurrency of transactions. We have concurrency control protocols to ensure atomicity, isolation and serializability of concurrent transactions. Concurrency control protocols can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two kinds:
Binary Locks: A lock on a data item can be in two states; it is either locked or unlocked.
Shared/exclusive: This type of locking mechanism differentiates the locks based on their uses. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock. Allowing more than one transaction to write on the same data item would lead the database into an inconsistent state. Read locks are shared, because no data value is being changed.
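A minimal single-process sketch of shared/exclusive compatibility, using Python's threading primitives (this illustrates the rule only; it is not a DBMS lock manager):

```python
import threading

class SharedExclusiveLock:
    """Minimal shared/exclusive lock: many readers OR one writer."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_shared(self):
        with self._cond:
            while self._writer:            # readers wait only for a writer
                self._cond.wait()
            self._readers += 1

    def release_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_exclusive(self):
        with self._cond:
            while self._writer or self._readers:   # a writer needs the item alone
                self._cond.wait()
            self._writer = True

    def release_exclusive(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()

lock = SharedExclusiveLock()
lock.acquire_shared()        # T1 reads
lock.acquire_shared()        # T2 reads concurrently: shared locks are compatible
lock.release_shared(); lock.release_shared()
lock.acquire_exclusive()     # T3 writes alone: exclusive with everything
lock.release_exclusive()
print("compatibility demonstrated")
```

The two `while` loops encode the compatibility matrix: shared/shared is allowed, while any combination involving exclusive must wait.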
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a write operation is performed. Transactions may unlock the data item after completing the 'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution of a transaction into three parts. In the first part, when the transaction starts executing, it seeks permission for the locks it requires. The second part is where the transaction acquires all the locks. As soon as the transaction releases its first lock, the third phase starts: in this phase, the transaction cannot demand any new locks; it only releases the acquired locks.
Two-phase locking has two phases: one is growing, where all the locks are being acquired by the transaction, and the second phase is shrinking, where the locks held by the transaction are being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as 2PL. After acquiring all the locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock after using it: Strict-2PL holds all the locks until the commit point and releases them all at once.
Strict-2PL does not have cascading aborts, as 2PL does.
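The growing/shrinking rule of basic 2PL can be sketched as a small Python class (illustrative only; a Strict-2PL variant would simply defer every unlock to the commit point):

```python
class TwoPhaseTransaction:
    """Sketch of the 2PL rule: once a transaction releases any lock
    (entering the shrinking phase), it may not acquire new ones."""
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {item!r} after first unlock (2PL)")
        self.locks.add(item)           # growing phase

    def unlock(self, item):
        self.shrinking = True          # first unlock starts the shrinking phase
        self.locks.discard(item)

t = TwoPhaseTransaction("T1")
t.lock("A"); t.lock("B")   # growing phase: acquire freely
t.unlock("A")              # shrinking phase begins here
try:
    t.lock("C")            # violates the two-phase rule
except RuntimeError as e:
    print(e)
```

The point at which the first lock is released is the transaction's lock point; 2PL serializability follows from ordering transactions by their lock points.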
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between the conflicting pairs among transactions at the time of execution, whereas timestamp-based protocols start working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age of the transaction. A transaction created at clock time 0002 would be older than all other transactions that come after it. For example, any transaction y entering the system at 0004 is two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read- and write-timestamp. This lets the system know when the last 'read and write' operation was performed on the data item.
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and implementing a road map/plan to mitigate them
(d) Explain knowledge-based database systems in detail.
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data Data was usually represented in a tabular format with strings or numbers in each
field
Multiple users A conventional database needed to support more than one user or system
logged into the same data at the same time
Transactions An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users These are the so-
called ACID properties Atomicity Consistency Isolation and Durability
Large long-lived data A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data Such a database usually needed to persist past
the specific uses of any individual program it needed to store data for years and decades
rather than for the life of a program
The first knowledge-based systems had data needs that were the opposite of these database
requirements An expert system requires structured data Not just tables with numbers and
strings but pointers to other objects that in turn have additional pointers The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes subclasses and instances
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge-base compared to a conventional database. The knowledge-base needed to know facts about the world: for example, to represent the statement that "All humans are mortal". A database typically could not represent this general knowledge, but instead would need to store information in thousands of tables about specific humans. Representing that all humans are mortal, and being able to reason that any given human is mortal, is the work of a knowledge-base. Representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments, the requirements for their data storage rapidly started to overlap with the standard database requirements for multiple, distributed users with support for transactions. Initially, the demand could be seen in two different but competitive markets. From the AI and object-oriented communities, object-oriented databases such as Versant emerged: systems designed from the ground up to support object-oriented capabilities, but also to support standard database services as well. On the other hand, the large database vendors such as Oracle added capabilities to their products that provided support for knowledge-base requirements, such as class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans, for example to serve as a repository of manuals, procedures, policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a system that could really be cleanly classified as knowledge-based in the sense of an expert system that performed automated reasoning, or knowledge-based in the sense of knowledge management that provided knowledge in the form of documents and media that could be leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018 – 2019
Subject: DBMS
MCA 1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another. Concurrent access is quite easy if all users are just reading data: there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur with a multi-user system. It helps you make sure that database transactions are performed concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, 'undividable') is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client: at one moment in time it has not yet happened, and at the next it has already occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B. Performing these operations in an atomic transaction ensures that the database remains in a consistent state, that is, money is neither lost nor created if either of those two operations fails.
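The bank-transfer example can be sketched with Python's sqlite3 module, where commit/rollback supplies the all-or-nothing guarantee (the `fail` flag simulating a crash is of course invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()

def transfer(conn, src, dst, amount, fail=False):
    """Both UPDATEs run inside one transaction: either both commit or neither."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        if fail:                        # simulate a crash between the two steps
            raise RuntimeError("system failure mid-transfer")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        conn.commit()
    except Exception:
        conn.rollback()                 # atomicity: the partial withdrawal is undone

transfer(conn, "A", "B", 40, fail=True)
print(dict(conn.execute("SELECT * FROM account")))  # {'A': 100, 'B': 0}

transfer(conn, "A", "B", 40)
print(dict(conn.execute("SELECT * FROM account")))  # {'A': 60, 'B': 40}
```

After the simulated failure the balances are unchanged; only the successful run moves the money, and the total is preserved in both cases.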
(B) Give the level architecture proposal for DBMS
Ans: Objectives of the three-level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
Above three points are explain in detail given bellow-
External Level
This is the highest level one that is closest to the user It is also called the user view The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language such as COBOL or C or a fourth-generation
language specific to the DBMS such as visual FoxPro or MS Access
The end user uses a query language to access data from the database. A query language is a combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares database objects, while the data manipulation language performs operations on these objects. The data control language is used to control the user's access to database objects.
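A small sketch of the first two sublanguages in action, using SQLite via Python's sqlite3 (table and column names are illustrative assumptions; SQLite has no user accounts, so the DCL statement appears only as a comment):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and declare a database object (illustrative table)
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("UPDATE student SET name = 'Asha K' WHERE roll_no = 1")
rows = conn.execute("SELECT name FROM student").fetchall()

# DCL would control access, e.g.:
#   GRANT SELECT ON student TO some_user;
# SQLite has no user management, so this is shown as a comment only.
```

After the DML statements, `rows` holds the single updated record.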
Conceptual Level - This level comes between the external and the internal levels. The conceptual level represents the entire database as a whole and is used by the DBA. This level is the view of the data "as it really is". The user's view of the data is constrained by the language they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of the architecture. It describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for DBMS are explained above.
(C) Describe the structure of DBMS
Ans: A DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations (insert, delete, update and retrieval) on the database. The components of the DBMS perform these requested operations on the database and provide the necessary data to the users.
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - The Data Description Language compiler processes schema definitions specified in the DDL. It stores metadata such as the names of the files and data items, storage details of each file, mapping information, and constraints.
2 DML Compiler and Query Optimizer - DML commands such as insert, update, delete and retrieve from the application program are sent to the DML compiler for compilation into object code for database access. The object code is then optimized by the query optimizer into the best way to execute the query and then sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also known as the Database Control System.
The main functions of the Data Manager are:
- Converting operations in users' queries, coming from the application programs or from the DML compiler and query optimizer (together known as the Query Processor), from the user's logical view to the physical file system.
- Controlling access to DBMS information stored on disk.
- Handling buffers in main memory.
- Enforcing constraints to maintain the consistency and integrity of the data.
- Synchronizing the simultaneous operations performed by concurrent users.
- Controlling backup and recovery operations.
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It contains information about:
1 Data - names of the tables, names of the attributes of each table, lengths of attributes, and the number of rows in each table.
2 Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3 Constraints on data, i.e. the range of values permitted.
4 Detailed information on physical database design, such as storage structures, access paths, and file and record sizes.
5 Access authorization - a description of database users, their responsibilities, and their access rights.
6 Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to actually control data integrity, database operation, and accuracy, and may be used as an important part of the DBMS.
Importance of Data Dictionary - The data dictionary is necessary in databases for the following reasons:
- It improves the control of the DBA over the information system and the users' understanding of the use of the system.
- It helps in documenting the database design process by storing documentation of the results of every design phase and of design decisions.
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts high-level queries into low-level file access commands known as compiled DML.
7 End Users - The users of the database system can be classified into the following groups, depending on their degree of expertise or the mode of their interaction with the DBMS:
1 Naïve users
2 Online users
3 Application programmers
4 Database administrator
i) Naïve Users: Naïve users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls into this category. The user is instructed through each step of a transaction, and he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of an automatic teller machine, only one or more of his or her own accounts. Other such naïve users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online Users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise within the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naïve users requiring help, such as menus.
iii) Application Programmers: Professional programmers who are responsible for developing the application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structures and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It implies the separation of physical storage from the use of the data by an application program, i.e. program/data independence: the user, programmer or application specialist need not know the details of how the data are stored, since such details are transparent to the user. Changes can be made to data without affecting other components of the system, e.g. changing the format of data items (real to integer arithmetic), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages:
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
- Duplication of the same data in different files.
- Wastage of storage space, since duplicated data is stored.
- Errors generated due to updating of the same data in different files.
- Time wasted in entering the same data again and again.
- Needless use of computer resources.
- Difficulty in combining information.
2 Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data. We therefore need to remove this duplication of data across multiple files to eliminate inconsistency.
3 Better Service to the Users - A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined into one centralized database, the availability of information and its up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, the use of a DBMS should allow users who don't know programming to interact with the data more easily, unlike a file processing system, where the programmer may need to write new programs to meet every new demand.
4 Flexibility of the System Is Improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5 Integrity Can Be Improved - Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6 Standards Can Be Enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7 Security Can Be Improved - In conventional systems, applications are developed in an ad hoc, temporary manner. Often different systems of an organization access different components of the operational data, and in such an environment enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8 The Organization's Requirements Can Be Identified - All organizations have sections and departments, and each of these units often considers the work of its own unit as the most important, and therefore considers its needs as the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. So it may become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9 The Overall Cost of Developing and Maintaining Systems Is Lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for a similar service using conventional systems, since the productivity of programmers can be higher using the non-procedural languages that have been developed with DBMSs than using procedural languages.
10 A Data Model Must Be Developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed as the needs of particular applications demand, and the overall view is often not considered. Building an overall view of an organization's data is usually cost-effective in the long term.
11 Provides Backup and Recovery - Centralizing a database provides schemes such as recovery and backup from failures, including disk crashes, power failures and software errors, which may help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The E-R model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. It is an iterative, team-oriented process in which all business managers (or their designates) are involved, and it should be validated with a "bottom-up" approach. It has three primary components: entities, relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A student's (entity) attributes: student ID, student name, address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One-to-many (1:M), Many-to-one (M:1),
Many-to-many (M:N)
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
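One common way to realize this entity relationally is to flatten the composite attributes (name, address, street) into individual columns. The sketch below uses SQLite via Python's sqlite3; treating phone_number as multivalued and moving it to its own table is a hypothetical design choice for illustration, not stated in the source:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite attributes flatten into individual columns.
conn.execute("""
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,
        first_name    TEXT, middle_name TEXT, last_name TEXT,
        date_of_birth TEXT,
        city TEXT, state TEXT, zip_code TEXT,
        street_name TEXT, street_number TEXT, apartment_number TEXT
    )
""")
# If phone_number were multivalued, it would move to a separate table
# keyed by customer_id (illustrative assumption).
conn.execute("""
    CREATE TABLE customer_phone (
        customer_id  INTEGER REFERENCES customer(customer_id),
        phone_number TEXT
    )
""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name) "
             "VALUES (1, 'Ravi', 'Shah')")
conn.execute("INSERT INTO customer_phone VALUES (1, '555-0101'), (1, '555-0102')")
n_phones = conn.execute("SELECT COUNT(*) FROM customer_phone "
                        "WHERE customer_id = 1").fetchone()[0]
```

Each customer row stays in first normal form while still allowing any number of phone numbers.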
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index-sequential files and direct files, we have considered the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
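A minimal in-memory sketch of the idea: the primary index maps each key to exactly one record, while the secondary index on "stud_name" maps a key value to a list of matching records. The records and field names are invented for illustration:

```python
from collections import defaultdict

# Student file with primary key roll_no; stud_name acts as a secondary key.
students = [
    {"roll_no": 1, "stud_name": "Amit",  "city": "Nagpur"},
    {"roll_no": 2, "stud_name": "Priya", "city": "Pune"},
    {"roll_no": 3, "stud_name": "Amit",  "city": "Mumbai"},
]

# Primary index: one record per key value.
primary = {s["roll_no"]: s for s in students}

# Secondary index: one key value may map to MANY records.
secondary = defaultdict(list)
for s in students:
    secondary[s["stud_name"]].append(s["roll_no"])

# Secondary key retrieval: all records whose stud_name is "Amit".
matches = [primary[r] for r in secondary["Amit"]]
```

Unlike the primary-key lookup, the secondary-key lookup returns a set of records.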
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3 - EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is a join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot have a lossless decomposition into any number of smaller tables.
Another way of expressing this is that every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key composed of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
- You always need to know two values (pairwise).
- For any one, you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and the vendor; to determine the vendor you must know the buyer and the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
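Using the sample data above, the decomposition into Buyer-Vendor, Buyer-Item and Vendor-Item and the subsequent rejoin can be sketched with Python sets. For this particular instance the natural join of the three projections reproduces the original table exactly, which is what the join dependency requires:

```python
# Buying(buyer, vendor, item) sample data from the table above.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three binary projections (the proposed 5NF tables).
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural join of the three projections on their common attributes.
rejoined = {
    (b, v, i)
    for (b, v) in buyer_vendor
    for (b2, i) in buyer_item if b2 == b
    for (v2, i2) in vendor_item if v2 == v and i2 == i
}
```

Because `rejoined` equals `buying`, the decomposition is lossless here; recording "Claiborne sells jeans" then needs only one new Vendor-Item row instead of one row per buyer.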
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
Fig: IMS system structure - application programs A and B (host language + DL/I) each operate through a PSB (a set of PCBs); the PCBs map through the IMS control program to the DBDs that define the physical databases.
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD). The mapping of the physical database to storage is also specified in the DBD. The set of DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1 DBD NAME=EDUCPDBD
2 SEGM NAME=COURSE,BYTES=256
3 FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4 FIELD NAME=TITLE,BYTES=33,START=4
5 FIELD NAME=DESCRIPN,BYTES=220,START=37
6 SEGM NAME=PREREQ,PARENT=COURSE,BYTES=36
7 FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8 FIELD NAME=TITLE,BYTES=33,START=4
9 SEGM NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called the program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example
1 PCB TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End users are supported via user-written online application programs; IMS does not provide an integrated query language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate key: a possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
The main characteristics of the functional dependencies used in normalization are that they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency, they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal-form condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties. Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps; each step corresponds to a specific normal form which has known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
- NF2: non-first normal form.
- 1NF: R is in 1NF iff all domain values are atomic.
- 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
- 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
- BCNF: R is in BCNF iff every determinant is a candidate key.
- Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and its multi-valued dependencies are functional dependencies. 4NF removes the unwanted data structures: multi-valued dependencies.
Either of these conditions must hold for a relation to be in fourth normal form:
- There is no multi-valued dependency in the relation; or
- There are multi-valued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it uses multi-valued dependencies.
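A hypothetical relation with two independent multivalued facts (names and data invented for illustration) shows what a 4NF decomposition does: each multi-valued dependency gets its own table, and their join still reconstructs the original without loss:

```python
# Employee(name, skill, language), where skill and language are independent
# multivalued facts: name ->> skill and name ->> language (illustrative data).
emp = {
    ("Ann", "SQL",  "English"),
    ("Ann", "SQL",  "Hindi"),
    ("Ann", "Java", "English"),
    ("Ann", "Java", "Hindi"),
}

# 4NF decomposition: one table per multi-valued dependency.
emp_skill = {(n, s) for n, s, l in emp}
emp_lang  = {(n, l) for n, s, l in emp}

# Natural join of the two projections reconstructs the original relation.
rejoined = {(n, s, l) for (n, s) in emp_skill for (n2, l) in emp_lang if n2 == n}
```

The decomposed tables store 2 + 2 rows instead of the 2 × 2 cross-product, and adding a new skill no longer requires one row per language.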
Q5
Either
(A) What are object oriented database systems What are its features
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could get a user's account information and efficiently provide extensive details such as transactions, account entries, etc.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used determines how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery is easier to achieve if you are running at least SQL Server 2000. It has a built-in feature known as the database recovery model, which controls the following:
- Both the speed and size of your transaction log backups.
- The degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery models available
- Full Recovery
- Bulk Logged Recovery
- Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
- Database restoration up to any specified time can be achieved after a media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
- The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
- It logs CREATE INDEX operations, so recovery from a transaction log backup that includes index creations is faster, because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(D) Describe deadlocks in a distributed system.
Ans
A lack of referential integrity in a database can lead to incomplete data being returned, usually with no indication of an error. This could result in records being "lost" in the database, because they're never returned in queries or reports.
It could also result in strange results appearing in reports (such as products without an associated company).
Or worse yet, it could result in customers not receiving products they paid for.
Worse still, it could affect life-and-death situations, such as a hospital patient not receiving the correct treatment, or a disaster relief team not receiving the correct supplies or information.
Data Integrity
Referential integrity is a subset of data integrity which is concerned with the accuracy and
consistency of all data (relationship or otherwise) Maintaining data integrity is a crucial part of
working with databases
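A minimal sketch of database-enforced referential integrity, using SQLite via Python's sqlite3 (table names are illustrative; note that SQLite only enforces foreign keys after the `foreign_keys` pragma is switched on):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires this pragma
conn.execute("CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE product (
        id         INTEGER PRIMARY KEY,
        company_id INTEGER NOT NULL REFERENCES company(id)
    )
""")
conn.execute("INSERT INTO company VALUES (1, 'Acme')")
conn.execute("INSERT INTO product VALUES (10, 1)")       # valid parent row
try:
    conn.execute("INSERT INTO product VALUES (11, 99)")  # no such company
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False  # the orphan row is rejected
```

The foreign-key constraint prevents exactly the "product without an associated company" situation described above.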
(D) Explain the following in detail with examples:
(i) Domain
Ans: Definition: The domain of a database attribute is the set of all allowable values that the attribute may assume.
Examples:
A field for gender may have the domain {male, female, unknown}, where those three values are the only permitted entries in that column.
In data management and database analysis, a data domain refers to all the unique values which a data element may contain. The rule for determining the domain boundary may be as simple as a data type with an enumerated list of values.[1]
For example, a database table that has information about people, with one record per person, might have a gender column. This gender column might be declared as a string data type and allowed to have one of two known code values: 'M' for male and 'F' for female, with NULL for records where gender is unknown or not applicable (or, arguably, 'U' for unknown as a sentinel value). The data domain for the gender column is {'M', 'F'}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value, excluding NULL. Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check
constraint or, in more complex cases, in a database trigger. For example, a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero.
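As a concrete illustration, the reference-table and check-constraint mechanisms described above can be sketched in SQLite (the table and column names here are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only after this opt-in

# Reference table holding the domain {M, F}
conn.execute("CREATE TABLE gender_ref (code TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO gender_ref VALUES (?)", [("M",), ("F",)])

conn.execute("""
    CREATE TABLE person (
        name   TEXT,
        gender TEXT REFERENCES gender_ref(code),   -- domain via reference table
        salary NUMERIC CHECK (salary > 0)          -- domain via check constraint
    )""")

conn.execute("INSERT INTO person VALUES ('Alice', 'F', 50000)")   # accepted

try:
    conn.execute("INSERT INTO person VALUES ('Bob', 'X', 40000)")  # 'X' not in domain
except sqlite3.IntegrityError as e:
    print("rejected:", e)   # the foreign key rejects the row

try:
    conn.execute("INSERT INTO person VALUES ('Carol', 'M', -5)")   # violates CHECK
except sqlite3.IntegrityError as e:
    print("rejected:", e)   # the check constraint rejects the row
```

Both bad inserts raise IntegrityError, so only the valid row is stored.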
This definition combines the concepts of domain as an area over which control is exercised and
the mathematical idea of a set of values of an independent variable for which a function is
defined
(ii) Degree and cardinality
The degree of relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) to the number of occurrences in another
There are three degrees of relationship known as
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
Note that the conventional notation for the last of these is M:N, not M:M.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-
to-one relationship rarely exists in practice, but it can; if it does, you may consider combining
the two entities into one.
For example an employee is allocated a company car which can only be driven by that
employee
Therefore there is a one-to-one relationship between employee and company car
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example,
taking the employee and department entities shown on the previous page, an employee works in
one department, but a department has many employees.
Therefore there is a one-to-many relationship between department and employee
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity
The normalisation process discussed earlier would prevent any such relationships but the
definition is included here for completeness
As with one-to-one relationships many-to-many relationships rarely exist Normally they occur
because an entity has been missed
For example an employee may work on several projects at the same time and a project has a
team of many employees
Therefore there is a many-to-many relationship between employee and project
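A common way to implement such an M:N relationship is a junction table that decomposes it into two 1:M relationships. A minimal sketch in SQLite, with invented employee and project data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);
    -- The junction table turns one M:N link into two 1:M links.
    CREATE TABLE assignment (
        emp_id  INTEGER REFERENCES employee(emp_id),
        proj_id INTEGER REFERENCES project(proj_id),
        PRIMARY KEY (emp_id, proj_id)
    );
    INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Inventory');
    INSERT INTO assignment VALUES (1, 10), (1, 20), (2, 10);
""")

# Each employee may appear on many projects and vice versa.
rows = conn.execute("""
    SELECT e.name, p.title
    FROM employee e JOIN assignment a ON e.emp_id = a.emp_id
                    JOIN project p    ON p.proj_id = a.proj_id
    ORDER BY e.name, p.title
""").fetchall()
print(rows)   # [('Asha', 'Inventory'), ('Asha', 'Payroll'), ('Ravi', 'Payroll')]
```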
Q4
EITHER
(a) Explain DBTG Data Manipulation
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL), the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct and self-contained language. The DBTG is intended to meet the
requirements of many distinct programming languages, not just COBOL: the user in a
DBTG system is considered to be an ordinary application programmer, and the language
therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the
Data Definition Language or DDL), the DBTG proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of a conceptual scheme that
was itself defined using the Data Definition Language. It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view.
Architecture of DBTG Model
The architecture of a DBTG system is illustrated in the figure.
The architecture of the DBTG model can be divided into three different levels, like the
architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL).
Schema
In DBTG, the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data items they
contain, and the sets into which they are grouped. (Here logical record types are referred
to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data item and set are excluded.
In the DBTG model, the users are application programmers writing in an ordinary
programming language, such as COBOL, that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data
item) defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization: The process of decomposing unsatisfactory (bad) relations by
breaking up their attributes into smaller relations
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties
Normalization in industry pays particular attention to
normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key.
Unnormalized Form (UNF)
A table that contains one or more repeating groups
To create an unnormalized table
transform data from the information source (e.g. a form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
If a table of data meets the definition of a relation it is in first normal form
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table
Identify repeating group(s) in unnormalized table which repeats for the key attribute(s)
Remove the repeating group by:
entering appropriate data into the empty columns of rows containing repeating
data ('flattening' the table)
Or by:
placing the repeating data, along with a copy of the original key attribute(s), into a
separate relation.
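The 'flattening' step can be illustrated with a small sketch (student and course data invented for the example): each value in the repeating group becomes its own row, with the key attribute copied down:

```python
# Unnormalized: each student row carries a repeating group of courses.
unnormalized = [
    # (student_id, name, [courses]  <- repeating group)
    (1, "Mary", ["DBMS", "OS"]),
    (2, "John", ["DBMS"]),
]

# Flatten: one row per value of the repeating group, key copied into each row.
first_normal_form = [
    (student_id, name, course)
    for student_id, name, courses in unnormalized
    for course in courses
]

for row in first_normal_form:
    print(row)
# (1, 'Mary', 'DBMS')
# (1, 'Mary', 'OS')
# (2, 'John', 'DBMS')
```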
Second Normal Form (2NF)
Based on the concept of full functional dependency:
A and B are attributes of a relation.
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A.
2NF: A relation that is in 1NF and in which every non-primary-key attribute is fully
functionally dependent on the primary key.
Second Normal Form (2NF)
1NF and no partial functional dependencies
Partial functional dependency when one or more non-key attributes are functionally
dependent on part of the primary key
Every non-key attribute must be defined by the entire key not just by part of the key
If a relation has a single attribute as its key then it is automatically in 2NF
1NF to 2NF
Identify primary key for the 1NF relation
Identify functional dependencies in the relation
If partial dependencies exist on the primary key, remove them by placing them in a new
relation along with a copy of their determinant.
Third Normal Form (3NF)
2NF and no transitive dependencies.
Transitive dependency: a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency:
A, B and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C).
3NF: A relation that is in 1NF and 2NF, and in which no non-primary-key
attribute is transitively dependent on the primary key.
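To make the 2NF test concrete, here is a small sketch (data invented for the example) that checks whether a functional dependency holds over sample rows, and uses it to expose a partial dependency on part of a composite key:

```python
def fd_holds(rows, lhs, rhs):
    """True if, whenever two rows agree on the lhs attributes, they agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False
        seen[key] = val
    return True

# Relation with composite key (student_id, course); student_name depends
# only on student_id, i.e. on part of the key -> a partial dependency.
rows = [
    {"student_id": 1, "course": "DBMS", "student_name": "Mary", "grade": "A"},
    {"student_id": 1, "course": "OS",   "student_name": "Mary", "grade": "B"},
    {"student_id": 2, "course": "DBMS", "student_name": "John", "grade": "A"},
]

print(fd_holds(rows, ["student_id"], ["student_name"]))     # True: partial dependency
print(fd_holds(rows, ["student_id", "course"], ["grade"]))  # True: full dependency
# To reach 2NF, move (student_id, student_name) into its own relation
# keyed by student_id alone.
```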
OR
(c) Explain multivalued dependency with suitable example.
Ans:
As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic.
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key.
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key.
5. BCNF: R is in BCNF iff every determinant is a candidate key.
6. Determinant: an attribute on which some other attribute is fully functionally
dependent.
Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is said to be in
4NF if and only if it is in BCNF and its multi-valued dependencies are functional
dependencies. 4NF removes unwanted data structures: multi-valued dependencies.
Either there is no multivalued dependency in the relation, or there are multivalued
dependencies but the dependent attributes depend on each other.
One of these conditions must hold for the relation to be in fourth normal form.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
deals with multivalued dependencies.
(d) What are inference axioms? Explain their significance in Relational
Database Design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule stating that if a relation satisfies certain FDs, then it must satisfy
certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: If X → Y, then XZ → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → City Street Zip.
Proof:
1. Zip → City — Given
2. Street Zip → Street City — Augmentation of (1) by Street
3. City Street → Zip — Given
4. City Street → City Street Zip — Augmentation of (3) by City Street
5. Street Zip → City Street Zip — Transitivity of (2) and (4)
[From Maier]
1. Let R = (A, B, C, D, E, G, H, I, J), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E — Given
2. AB → AB — Reflexivity
3. AB → B — Projectivity from (2)
4. AB → BE — Additivity from (1) and (3)
5. BE → I — Given
6. AB → I — Transitivity from (4) and (5)
7. E → G — Given
8. AB → G — Transitivity from (1) and (7)
9. AB → GI — Additivity from (6) and (8)
10. GI → H — Given
11. AB → H — Transitivity from (9) and (10)
12. AB → GH — Additivity from (8) and (11)
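The same derivation can be checked mechanically by computing the attribute closure of AB under F: AB → GH holds exactly when G and H appear in that closure. A sketch:

```python
def closure(attrs, fds):
    """Closure of an attribute set under a list of (lhs, rhs) FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is covered, absorb the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# F from the example above: AB->E, AG->J, BE->I, E->G, GI->H
F = [({"A", "B"}, {"E"}), ({"A", "G"}, {"J"}), ({"B", "E"}, {"I"}),
     ({"E"}, {"G"}), ({"G", "I"}, {"H"})]

c = closure({"A", "B"}, F)
print(sorted(c))          # ['A', 'B', 'E', 'G', 'H', 'I', 'J']
print({"G", "H"} <= c)    # True: AB -> GH is derivable
```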
Significance in Relational Database Design: The relational model is a database structure (commonly used in GIS) in
which data is stored in two-dimensional tables, where multiple relationships between data
elements can be defined and established in an ad-hoc manner. A Relational Database Management
System is a database system made up of files with data elements in two-dimensional arrays (rows
and columns). This database management system has the capability to recombine data elements
to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables.
• Tables are manipulated a set at a time, rather than a record at a time.
• SQL is used to manipulate relational databases. The model was proposed by Dr. Codd in 1970.
• It is the basis for the relational database management system (RDBMS).
• The relational model contains the following components:
• a collection of objects or relations;
• a set of operations to act on the relations.
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be handled in two ways: one is to put measures in place that
prevent deadlocks from happening, and the other is to provide ways to break a deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or to
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order to prevent such instances. Essentially,
once a deadlock does occur, the DBMS must have a method for detecting the deadlock;
to resolve it, the DBMS must select a transaction to cancel, and revert the entire
transaction until the resources required become available, allowing one transaction to
complete while the other has to be reprocessed at a later time.

Explain the meaning of the expression "ACID transaction".
Ans: ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it
should be atomic: it should either be complete or not happen at all; there should not
be anything like a semi-complete state. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, then the transactions
should be scheduled in such a fashion that they remain in isolation from one another. Durability
means that once a transaction commits, its effects will persist even if there are system failures.

What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the process of being
changed. Their purpose is to ensure consistency throughout the database. For example, if I
am changing a row which affects the calculations or outputs of several other rows, then
all rows that are affected, or possibly affected, by a change in the row I'm working on will
be locked from changes until my change is complete. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, can also affect how those rows are read. So let's
say I'm processing a change to the tax rate in my state; my store clerk shouldn't be able
to read the total cost of a blue shirt, because the total cost row is affected by any change in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but hasn't been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
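The resource-ordering idea from part (a), locking resources in one fixed global order so that circular waits cannot form, can be sketched with ordinary threading locks (account names and amounts are invented for the example):

```python
import threading

accounts = {"A": 100, "B": 100}
locks = {name: threading.Lock() for name in accounts}

def transfer(src, dst, amount):
    # Always acquire locks in one fixed global order (here: sorted by name),
    # so two transfers in opposite directions can never deadlock.
    first, second = sorted((src, dst))
    with locks[first], locks[second]:
        accounts[src] -= amount
        accounts[dst] += amount

t1 = threading.Thread(target=transfer, args=("A", "B", 30))
t2 = threading.Thread(target=transfer, args=("B", "A", 10))  # opposite direction
t1.start(); t2.start()
t1.join(); t2.join()
print(accounts)   # {'A': 80, 'B': 120}
```

Without the sorted() line, t1 locking A then B while t2 locks B then A could deadlock; the fixed order makes that impossible.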
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
Binary Locks: A lock on a data item can be in two states; it is either locked or
unlocked.
Shared/exclusive: This type of locking mechanism differentiates the locks based on
their use. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock. Allowing more than one transaction to write to the same data item
would lead the database into an inconsistent state. Read locks are shared because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks. Before initiating an execution, the transaction requests the system for all the locks it
needs beforehand. If all the locks are granted, the transaction executes and releases all the locks
when all its operations are over. If all the locks are not granted, the transaction rolls back and
waits until all the locks are granted.
Two-Phase Locking (2PL)
This locking protocol divides the execution of a transaction into three parts. In the first
part, when the transaction starts executing, it seeks permission for the locks it requires. The
second part is where the transaction acquires all the locks. As soon as the transaction releases its
first lock, the third phase starts. In this phase, the transaction cannot demand any new locks; it
only releases the acquired locks.
Two-phase locking has two phases: one is growing, where all the locks are being acquired by
the transaction, and the second phase is shrinking, where the locks held by the transaction are
being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as that of 2PL. After acquiring all the locks in the first phase, the
transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a
lock after using it. Strict-2PL holds all the locks until the commit point and releases all the locks
at one time.
Strict-2PL does not have the cascading aborts that 2PL may have.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between conflicting pairs among transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at 0002 clock time would be older than all other
transactions that come after it. For example, any transaction 'y' entering the system at 0004 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read and write timestamps. This lets the system
know when the last 'read' and 'write' operations were performed on the data item.
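The core timestamp-ordering rules can be sketched as follows (a simplified model, not a full protocol: transaction restarts and commit handling are omitted):

```python
class DataItem:
    def __init__(self):
        self.read_ts = 0    # timestamp of the youngest reader so far
        self.write_ts = 0   # timestamp of the youngest writer so far

def try_write(item, ts):
    """A transaction may write only if no younger transaction has read or
    written the item; otherwise the older transaction must roll back."""
    if ts < item.read_ts or ts < item.write_ts:
        return False        # conflict with a younger transaction: abort
    item.write_ts = ts
    return True

def try_read(item, ts):
    """A transaction may read only if no younger transaction has written."""
    if ts < item.write_ts:
        return False        # the value it needs has been overwritten: abort
    item.read_ts = max(item.read_ts, ts)
    return True

x = DataItem()
print(try_write(x, ts=2))   # True: first writer
print(try_read(x, ts=3))    # True: the reader is younger than the writer
print(try_write(x, ts=1))   # False: an older transaction arrives too late
```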
OR
(b) Explain database security mechanisms.
Ans: Database security covers and enforces security on all aspects and components of databases. This
includes:
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls.
Load/stress testing and capacity testing of a database to ensure it does not crash under a
distributed denial of service (DDoS) attack or user overload.
Physical security of the database server and backup equipment from theft and natural
disasters.
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them.
(d) Explain knowledge-based database systems in detail.
Ans:
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties:
Flat data: Data was usually represented in a tabular format with strings or numbers in each
field.
Multiple users: A conventional database needed to support more than one user or system
logged into the same data at the same time.
Transactions: An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-
called ACID properties: Atomicity, Consistency, Isolation, and Durability.
Large, long-lived data: A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze and reuse
knowledge. For example, see the discussion of Corporate Memory in the earliest work on the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge base compared to a conventional
database. The knowledge base needed to know facts about the world, for example to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store information in thousands of rows that
represented information about specific humans. Representing that all humans are mortal, and
being able to reason about any given human that they are mortal, is the work of a knowledge
base. Representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other
customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments,
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple, distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and object-oriented
communities, object-oriented databases such as Versant emerged. These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet. With the rise of the Internet,
documents, hypertext and multimedia support were now critical for any corporate database. It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents. This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes. Knowledge management actually predated the
Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined As the technology scaled up it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning and knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by us humans
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data: there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-
user system. It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state; that is, money is neither lost nor created if either of those two operations fails.
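The transfer example can be sketched in SQLite, where wrapping both updates in one transaction gives the all-or-nothing guarantee (account names and amounts are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance NUMERIC)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()

def transfer(amount, fail=False):
    try:
        with conn:  # this block commits on success, rolls back on any exception
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,))
            if fail:
                raise RuntimeError("crash between the two operations")
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,))
    except RuntimeError:
        pass  # transfer aborted; the withdrawal was rolled back too

transfer(40, fail=True)    # simulated failure: nothing is applied
print(conn.execute("SELECT name, balance FROM account ORDER BY name").fetchall())
# [('A', 100), ('B', 0)]   -- money neither lost nor created

transfer(40)               # successful transfer: both updates applied
print(conn.execute("SELECT name, balance FROM account ORDER BY name").fetchall())
# [('A', 60), ('B', 40)]
```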
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
The above three levels are explained in detail below.
External Level
This is the highest level one that is closest to the user It is also called the user view The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
being used; at the conceptual level, the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for DBMS are explained above.
(C) Describe the structure of DBMS
Ans: A DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update and
retrieve) on the database. The components of the DBMS perform these requested operations on the
database and provide the necessary data to the users.
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - The Data Description Language compiler processes schema definitions specified
in the DDL. It includes metadata information such as the names of the files and data items, storage
details of each file, mapping information, constraints, etc.
2 DML Compiler and Query Optimizer - DML commands such as insert, update, delete and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized, to find the best way to execute the query, by
the query optimizer and then sent to the data manager.
3. Data Manager - The data manager is the central software component of the DBMS, also known
as the database control system.
The main functions of the data manager are:
• Converting operations in user queries, coming from the application programs or from the
DML compiler and query optimizer (together known as the query processor), from the user's logical view
to the physical file system.
• Controlling access to the DBMS information that is stored on disk.
• Handling buffers in main memory.
• Enforcing constraints to maintain the consistency and integrity of the data.
• Synchronizing the simultaneous operations performed by concurrent users.
• Controlling the backup and recovery operations.
4. Data Dictionary - The data dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e., the range of values permitted.
4. Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
5. Access authorization - a description of database users, their responsibilities, and their access rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to control data integrity, database operation, and accuracy, and may be considered an important part of the DBMS.
Importance of Data Dictionary -
The data dictionary is necessary in databases for the following reasons:
• It improves the DBA's control over the information system and the users' understanding of the system.
• It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
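As an analogy, SQLite keeps its own data dictionary in the catalog table sqlite_master, which records the name, type, and defining SQL of every object (the table created below is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")

# sqlite_master is SQLite's built-in catalog: metadata about every object.
meta = cur.execute(
    "SELECT type, name, sql FROM sqlite_master WHERE name = 'course'"
).fetchone()
print(meta[0], meta[1])  # table course
conn.close()
```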
5 Data Files - It contains the data portion of the database
6. Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve Users: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database (in the case of the ATM user, only one or more of his or her own accounts). Other such naive users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online Users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. They are aware of the presence of the database system and may have acquired a certain amount of expertise through the limited interaction they are permitted with the database via the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application Programmers: Professional programmers who are responsible for developing the application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal, or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying, and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It also implies the separation of physical storage from the use of the
data by an application program, i.e., program/data independence: the user, programmer, or
application specialist need not know the details of how the data are stored, since such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g., changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one device to
another (e.g., from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files.
• Wastage of storage space, since duplicated data is stored.
• Errors generated due to updating of the same data in different files.
• Time wasted in entering the same data again and again.
• Computer resources being needlessly used.
• Difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data. We therefore need to remove this duplication of
data across multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In a
conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness are likely to improve, since the data can now be shared, and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. The use of a DBMS
should also allow users who don't know programming to interact with the data more easily, unlike a
file processing system, where the programmer may need to write new programs to meet every
new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the data in
the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
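A minimal sketch of centralized integrity enforcement, using a SQLite CHECK constraint (the table and column names are illustrative): the DBMS rejects bad data regardless of which application submits it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# The constraint lives in the database, not in any one application.
cur.execute(
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, "
    "balance REAL CHECK (balance >= 0))"
)
cur.execute("INSERT INTO account VALUES (1, 100.0)")

rejected = False
try:
    cur.execute("INSERT INTO account VALUES (2, -50.0)")  # violates CHECK
except sqlite3.IntegrityError:
    rejected = True

remaining = cur.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(rejected, remaining)  # True 1
conn.close()
```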
6. Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to which parts of the database: different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8. The organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work, and therefore its own needs,
as the most important. Once a database has been
set up with centralized control, it becomes necessary to identify the organization's requirements and
to balance the needs of the competing units. It may therefore become necessary to ignore some
requests for information if they conflict with higher-priority needs of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed with DBMSs than using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand, and the overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database allows schemes for backup and
recovery from failures, including disk crashes, power failures, and software errors,
which may help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods involved are very complex.
QUE2- EITHER
(A) Explain the ER model with a suitable example.
Ans: The ER model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Modelling is an
iterative, team-oriented process in which all business managers (or their designates) should be
involved, and the result should be validated with a "bottom-up" approach. The model has three primary components: entities,
relationships, and attributes.
Of the many notation methods, Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships, and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type. There are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name,
address, etc.
Attributes are of various types:
• Simple/single attributes
• Composite attributes
• Multivalued attributes
• Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One-to-many: 1 ------- M
Many-to-one: M ------- 1
Many-to-many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), where
street is itself composite: (street_name, street_number, apartment_number).
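One possible mapping of this Customer entity to relations, sketched in SQLite: the composite attributes name and address are flattened into columns, and the multivalued attribute phone_number becomes a separate table (names follow the example above; the sample data is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Composite attributes (name, address, street) flattened into columns;
# the multivalued attribute phone_number gets its own table.
cur.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    first_name    TEXT, middle_name TEXT, last_name TEXT,
    date_of_birth TEXT,
    city TEXT, state TEXT, zip_code TEXT,
    street_name TEXT, street_number TEXT, apartment_number TEXT
);
CREATE TABLE customer_phone (
    customer_id  INTEGER REFERENCES customer(customer_id),
    phone_number TEXT,
    PRIMARY KEY (customer_id, phone_number)
);
""")
cur.execute("INSERT INTO customer (customer_id, first_name, last_name) "
            "VALUES (1, 'Ravi', 'Shah')")
cur.execute("INSERT INTO customer_phone VALUES (1, '555-0101')")
cur.execute("INSERT INTO customer_phone VALUES (1, '555-0102')")
n = cur.execute("SELECT COUNT(*) FROM customer_phone "
                "WHERE customer_id = 1").fetchone()[0]
print(n)  # 2: one customer, many phone numbers
conn.close()
```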
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file, and direct file organizations, we have considered the retrieval and
update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
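A minimal in-memory sketch of a secondary index on stud_name, mapping one key value to the set of matching records (the sample data is illustrative):

```python
from collections import defaultdict

# Student records, keyed by the primary key stud_id.
students = [
    {"stud_id": 1, "stud_name": "Amit"},
    {"stud_id": 2, "stud_name": "Priya"},
    {"stud_id": 3, "stud_name": "Amit"},
]

# Secondary index on stud_name: one key value maps to MANY records,
# unlike a primary key, which identifies exactly one.
by_name = defaultdict(list)
for rec in students:
    by_name[rec["stud_name"]].append(rec["stud_id"])

print(by_name["Amit"])  # [1, 3]
```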
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3 - EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give an equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is a join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot
have a lossless decomposition into any number of smaller tables.
Another way of expressing this is: every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
• You always need to know two values (pairwise).
• For any one value you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key: in order to determine
the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item, and Vendor-Item.
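The three-table decomposition can be sketched with Python sets, using the sample data above: projecting Buying onto Buyer-Vendor, Buyer-Item, and Vendor-Item and then re-joining the projections reproduces the original table, so the decomposition is lossless for this data.

```python
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach",       "Jeans"),
    ("Mary",  "Jordach",       "Jeans"),
    ("Sally", "Jordach",       "Sneakers"),
}

# Project onto the three pairs of attributes.
bv = {(b, v) for b, v, i in buying}   # Buyer-Vendor
bi = {(b, i) for b, v, i in buying}   # Buyer-Item
vi = {(v, i) for b, v, i in buying}   # Vendor-Item

# Natural join of the three projections on the shared attributes.
joined = {(b, v, i)
          for b, v in bv
          for b2, i in bi if b == b2
          if (v, i) in vi}

print(joined == buying)  # True: lossless for this data
```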
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
[Fig.: IMS system structure - each application program (A, B) is written in a host language plus DL/I, has its own PSB made up of PCBs, and communicates through the IMS control program, which uses the DBDs.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated
conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
DBD   NAME=EDUCPDBD
SEGM  NAME=COURSE,BYTES=256
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
FIELD NAME=DESCRIPN,BYTES=220,START=37
SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
FIELD NAME=LOCATION,BYTES=12,START=7
FIELD NAME=FORMAT,BYTES=2,START=19
SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example:
PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
SENSEG NAME=COURSE,PROCOPT=G
SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following:
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate
key, and no attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of the functional dependencies used in normalization:
• They have a 1:1 relationship between the attribute(s) on the left-hand and right-hand sides of
the dependency.
• They hold for all time.
• They are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is
important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation,
and that has the property that every functional dependency in Y is implied by the
functional dependencies in X.
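A small sketch of checking whether a functional dependency holds in a relation (represented as a list of dicts; the staff/branch data is illustrative): equal determinant values must always imply equal dependent values.

```python
def holds(relation, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds:
    rows that agree on lhs must also agree on rhs."""
    seen = {}
    for row in relation:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False  # same determinant, different dependent value
        seen[key] = val
    return True

staff = [
    {"staff_no": 1, "branch": "B5", "city": "London"},
    {"staff_no": 2, "branch": "B5", "city": "London"},
    {"staff_no": 3, "branch": "B3", "city": "Glasgow"},
]
print(holds(staff, ["branch"], ["city"]))    # True
print(holds(staff, ["city"], ["staff_no"]))  # False
```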
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF, or 4NF.
We will pay particular attention up to 3NF.
Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies
between its attributes.
It is often executed as a series of steps, where each step corresponds to a specific normal form with
known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies.
• NF2: non-first normal form.
• 1NF: R is in 1NF iff all domain values are atomic.
• 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
• 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
• BCNF: R is in BCNF iff every determinant is a candidate key.
• Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and
only if it is in BCNF and all of its multivalued dependencies are functional dependencies. 4NF
removes unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form, either of these conditions must hold:
• There is no multivalued dependency in the relation; or
• There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it also uses
multivalued dependencies.
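The effect of a multivalued dependency, and the 4NF fix, can be sketched with the classic employee-skill-language situation (the data is illustrative, not from the source): skill and language vary independently, so each multivalued fact is moved to its own table, and the join over the key reconstructs the original without loss.

```python
# Unnormalized: every (skill, language) combination must be stored,
# because skill and language vary independently of each other.
emp = {
    ("Smith", "Typing",  "English"),
    ("Smith", "Typing",  "French"),
    ("Smith", "Welding", "English"),
    ("Smith", "Welding", "French"),
}

# 4NF decomposition: one table per multivalued fact.
emp_skill = {(e, s) for e, s, l in emp}
emp_lang  = {(e, l) for e, s, l in emp}

# Joining over the key (ename) reconstructs the original relation.
rejoined = {(e, s, l)
            for e, s in emp_skill
            for e2, l in emp_lang if e == e2}
print(rejoined == emp)  # True
```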
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000.
It has a built-in feature known as the database recovery model that controls the following:
• Both the speed and size of your transaction log backups.
• The degree to which you might be at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery models available:
• Full Recovery
• Bulk-Logged Recovery
• Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. The SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
• Database restoration up to any specified time can be achieved after media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
• The Log Marks feature allows you to place reference points in the transaction log that allow you to
recover to a log mark.
• CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index
creations is faster because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the
least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(D) Describe deadlocks in a distributed system.
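A common basis for deadlock handling in a distributed system is a (global) wait-for graph, in which an edge from T1 to T2 means transaction T1 is waiting for a lock held by T2; a cycle in this graph indicates a deadlock. A minimal detection sketch (the transaction names are illustrative):

```python
def has_cycle(wait_for):
    """Detect a cycle (deadlock) in the wait-for graph by depth-first search."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {t: WHITE for t in wait_for}

    def dfs(t):
        color[t] = GREY
        for u in wait_for.get(t, []):
            if color.get(u, WHITE) == GREY:
                return True        # back edge: cycle found
            if color.get(u, WHITE) == WHITE and dfs(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and dfs(t) for t in wait_for)

# T1 (at site A) waits for T2 (at site B), which waits for T1: deadlock.
print(has_cycle({"T1": ["T2"], "T2": ["T1"]}))  # True
print(has_cycle({"T1": ["T2"], "T2": []}))      # False
```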
(D) Explain the following in detail with examples:
(i) Domain
Ans: Definition: The domain of a database attribute is the set of all allowable values that the
attribute may assume.
Examples:
Examples
A field for gender may have the domain {male, female, unknown}, where those three values are
the only permitted entries in that column.
In data management and database analysis, a data domain refers to all the unique values which
a data element may contain. The rule for determining the domain boundary may be as simple as
a data type with an enumerated list of values.
For example, a database table that has information about people, with one record per person,
might have a gender column. This gender column might be declared as a string data type and
allowed to have one of two known code values: M for male and F for female, plus NULL for
records where gender is unknown or not applicable (or, arguably, U for unknown as a sentinel
value). The data domain for the gender column is {M, F}.
In a normalized data model, the reference domain is typically specified in a reference table.
Following the previous example, a Gender reference table would have exactly two records, one
per allowed value (excluding NULL). Reference tables are formally related to other tables in a
database by the use of foreign keys.
Less simple domain boundary rules, if database-enforced, may be implemented through a check
constraint or, in more complex cases, in a database trigger. For example, a column requiring
positive numeric values may have a check constraint declaring that the values must be greater
than zero.
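A minimal sketch of such a database-enforced domain, using a SQLite CHECK constraint for the gender example above (the table name is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# The domain {M, F} (plus NULL for unknown) enforced by a CHECK constraint.
cur.execute("""
CREATE TABLE person (
    id     INTEGER PRIMARY KEY,
    gender TEXT CHECK (gender IN ('M', 'F') OR gender IS NULL)
)""")
cur.execute("INSERT INTO person VALUES (1, 'F')")

rejected = False
try:
    cur.execute("INSERT INTO person VALUES (2, 'X')")  # outside the domain
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
conn.close()
```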
This definition combines the concept of domain as an area over which control is exercised with
the mathematical idea of a set of values of an independent variable for which a function is
defined.
(ii) Degree and cardinality
The degree of a relationship (also known as cardinality) is the number of occurrences in one
entity which are associated (or linked) with the number of occurrences in another.
There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
Note that the correct notation for the last is M:N, not M:M.
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-
to-one relationship rarely exists in practice, but it can; in that case you may consider combining
the two entities into one.
For example, an employee is allocated a company car which can only be driven by that
employee.
Therefore there is a one-to-one relationship between employee and company car.
One-to-many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example,
taking the employee and department entities shown on the previous page, an employee works in
one department, but a department has many employees.
Therefore there is a one-to-many relationship between department and employee.
Many-to-many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity.
The normalisation process discussed earlier would prevent any such relationships, but the
definition is included here for completeness.
As with one-to-one relationships, many-to-many relationships rarely exist; normally they occur
because an entity has been missed.
For example, an employee may work on several projects at the same time, and a project has a
team of many employees.
Therefore there is a many-to-many relationship between employee and project.
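In a relational design, the employee-project M:N relationship above is normally resolved with a junction (link) table holding one row per assignment; a minimal SQLite sketch (all names and data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# works_on is the junction table: one row per (employee, project) assignment.
cur.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE works_on (
    emp_id  INTEGER REFERENCES employee(emp_id),
    proj_id INTEGER REFERENCES project(proj_id),
    PRIMARY KEY (emp_id, proj_id)
);
INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Billing');
INSERT INTO works_on VALUES (1, 10), (1, 20), (2, 10);
""")
# Employee 1 works on many projects; project 10 has many employees.
projs = [r[0] for r in cur.execute(
    "SELECT title FROM project JOIN works_on USING (proj_id) "
    "WHERE emp_id = 1 ORDER BY proj_id")]
print(projs)  # ['Payroll', 'Billing']
conn.close()
```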
Q4
EITHER
(a) Explain DBTG data manipulation.
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on
Data Systems Languages (CODASYL), the group responsible for standardization of the
programming language COBOL. The DBTG final report appeared in April 1971; it
introduced a new, distinct, and self-contained language. The DBTG is intended to meet the
requirements of many distinct programming languages, not just COBOL: the user in a
DBTG system is considered to be an ordinary application programmer, and the language
therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the
Data Definition Language or DDL), the DBTG proposed a Subschema Data
Definition Language (Subschema DDL) for defining views of the conceptual scheme that
was itself defined using the Data Definition Language. It also proposed a Data
Manipulation Language (DML) suitable for writing application programs that
manipulate the conceptual scheme or a view
Architecture of DBTG Model
The architecture of a DBTG system is illustrated in the figure. The architecture of the
DBTG model can be divided into three different levels, like the architecture of a database
system. These are
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema,
written in a Data Storage Description Language (DSDL)
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists
essentially of definitions of the various types of record in the database, the data items they
contain, and the sets into which they are grouped. (Here logical record types are referred
to as record types; the fields in a logical record format are called data items)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists
essentially of a specification of which schema record types the user is interested in, which
schema data items he or she wishes to see in those records, and which schema
relationships (sets) linking those records he or she wishes to consider. By default, all
other types of record, data item and set are excluded
In the DBTG model the users are application programmers writing in an ordinary
programming language, such as COBOL, that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema; using the COBOL Data Base Facility, for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data
item) defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema
Q5
EITHER
(a) Define Normalization Explain first and second normal form
Ans Normalization The process of decomposing unsatisfactory "bad" relations by
breaking up their attributes into smaller relations
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties
Normalization in industry pays particular attention to
normalization up to 3NF, BCNF or 4NF
We will pay particular attention up to 3NF
NF2 non-first normal form
1NF R is in 1NF iff all domain values are atomic
2NF R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the
key
3NF R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups
To create an unnormalized table
transform data from information source (eg form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value
If a table of data meets the definition of a relation it is in first normal form
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table
Identify repeating group(s) in unnormalized table which repeats for the key attribute(s)
Remove repeating group by
entering appropriate data into the empty columns of rows containing repeating
data ('flattening' the table)
Or by
placing repeating data, along with a copy of the original key attribute(s), into a
separate relation
Second Normal Form (2NF)
Based on concept of full functional dependency
A and B are attributes of a relation
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A
2NF - A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on the primary key
Second Normal Form (2NF)
1NF and no partial functional dependencies
Partial functional dependency when one or more non-key attributes are functionally
dependent on part of the primary key
Every non-key attribute must be defined by the entire key not just by part of the key
If a relation has a single attribute as its key then it is automatically in 2NF
1NF to 2NF
Identify primary key for the 1NF relation
Identify functional dependencies in the relation
If partial dependencies exist on the primary key remove them by placing them in a new
relation along with copy of their determinant
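The 1NF-to-2NF step above can be illustrated in a few lines of Python. The enrolment relation below is hypothetical: its key is (student_id, course_id), and course_title depends only on course_id, a partial dependency that the decomposition removes. The natural join of the pieces reproduces the original relation:

```python
# Hypothetical 1NF relation: (student_id, course_id, course_title, grade)
# with key (student_id, course_id); course_title depends only on
# course_id, so the relation is not in 2NF.
enrolment = [
    ("S1", "C1", "Databases", "A"),
    ("S1", "C2", "Networks",  "B"),
    ("S2", "C1", "Databases", "C"),
]

# Remove the partial dependency: course_title moves to a new
# relation keyed by course_id alone (the 2NF decomposition).
course = {(c, t) for (_, c, t, _) in enrolment}
grade = [(s, c, g) for (s, c, _, g) in enrolment]

# The natural join of the two relations reproduces the original,
# so the decomposition is lossless.
rejoined = sorted((s, c, t, g)
                  for (s, c, g) in grade
                  for (c2, t) in course if c2 == c)
```

Note that "Databases" is now stored once in `course` rather than once per enrolment row.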
Third Normal Form (3NF)
2NF and no transitive dependencies
Transitive dependency a functional dependency between two or more non-key attributes
Based on concept of transitive dependency
A, B and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C)
3NF - A relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key
OR
(c) Explain multivalued dependency with suitable example
Ans As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies
1 NF2 non-first normal form
2 1NF R is in 1NF iff all domain values are atomic
3 2NF R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4 3NF R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5 BCNF R is in BCNF iff every determinant is a candidate key
6 Determinant an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multivalued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is in
4NF if and only if it is in BCNF and all of its multivalued dependencies are functional
dependencies. 4NF thus removes the unwanted data structures caused by multivalued dependencies
For a relation to be in fourth normal form, either of these conditions must hold
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes are dependent between themselves
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
considers multivalued dependencies
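A multivalued dependency can be checked mechanically: X →→ Y holds iff, within every group of tuples agreeing on X, the Y-values and the remaining attributes combine freely (a cross product). A minimal Python sketch, using the classic employee/skill/language example (the relation and names are illustrative, not from the text above):

```python
from itertools import product

def holds_mvd(rows, x, y):
    """Check the MVD X ->-> Y on a relation given as a list of dicts.
    Holds iff, within every X-group, the Y-values and the remaining
    attributes Z combine freely (the group is a cross product)."""
    attrs = rows[0].keys()
    z = [a for a in attrs if a not in x and a not in y]
    groups = {}
    for r in rows:
        groups.setdefault(tuple(r[a] for a in x), []).append(r)
    for grp in groups.values():
        ys = {tuple(r[a] for a in y) for r in grp}
        zs = {tuple(r[a] for a in z) for r in grp}
        seen = {tuple(r[a] for a in y + z) for r in grp}
        if seen != {yv + zv for yv, zv in product(ys, zs)}:
            return False
    return True

# An employee's skills and spoken languages vary independently,
# so emp ->-> skill (and symmetrically emp ->-> lang) holds.
rows = [
    {"emp": "E1", "skill": "SQL",  "lang": "EN"},
    {"emp": "E1", "skill": "SQL",  "lang": "FR"},
    {"emp": "E1", "skill": "Java", "lang": "EN"},
    {"emp": "E1", "skill": "Java", "lang": "FR"},
]
```

Deleting any one of the four rows breaks the cross product and the MVD no longer holds, which is exactly the redundancy 4NF decomposition removes.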
(d) What are inference axioms Explain its significance in Relational
Database Design
Ans Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy
certain other FDs
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1 Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}
We want to show Street Zip → Street Zip City
Proof
1 Zip → City – Given
2 Street Zip → Street City – Augmentation of (1) by Street
3 City Street → Zip – Given
4 City Street → City Street Zip – Augmentation of (3) by City Street
5 Street Zip → City Street Zip – Transitivity of (2) and (4)
[From Maier]
1 Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}
Show that AB → GH is derived by F
1 AB → E – Given
2 AB → AB – Reflexivity
3 AB → B – Projectivity from (2)
4 AB → BE – Additivity from (1) and (3)
5 BE → I – Given
6 AB → I – Transitivity from (4) and (5)
7 E → G – Given
8 AB → G – Transitivity from (1) and (7)
9 AB → GI – Additivity from (6) and (8)
10 GI → H – Given
11 AB → H – Transitivity from (9) and (10)
12 AB → GH – Additivity from (8) and (11)
Significance in Relational Database design A relational database is a database structure, commonly used in
GIS, in which data is stored in two-dimensional tables and multiple relationships between data
elements can be defined and established in an ad-hoc manner. A Relational Database Management
System is a database system made up of files with data elements in two-dimensional arrays (rows
and columns). This database management system has the capability to recombine data elements
to form different relations, resulting in great flexibility of data usage
A database that is perceived by the user as a collection of two-dimensional tables
• Relations are manipulated a set at a time rather than a record at a time
• SQL is used to manipulate relational databases; the model was proposed by Dr Codd in 1970
• It is the basis for the relational database management system (RDBMS)
• The relational model contains the following components:
• A collection of objects or relations
• A set of operations to act on the relations
Q5
EITHER
(a) What is deadlock How can it be avoided How can it be
resolved once it occurs
Ans A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be handled in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order,
meaning resources must be locked in a certain order to prevent such instances. Once a deadlock
does occur, the DBMS must have a method for detecting it, and then to resolve it the DBMS
must select a transaction to cancel and revert the entire transaction until the resources
required become available, allowing one transaction to complete while the other has to be
reprocessed at a later time
9.21 Explain the meaning of the expression ACID transaction
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens it
should be atomic, that is, it should either be complete or fully incomplete; there should not
be anything like semi-complete. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, then the transactions
should be scheduled in such a fashion that they remain in isolation from one another. Durability
means that once a transaction commits, its effects will persist even if there are system failures
9.24 What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the
process of being changed. Their purpose is to ensure consistency throughout the database. For
example, if I am changing a row which affects the calculations or outputs of several other rows,
then all rows that are affected, or possibly affected, by a change in the row I am working on will
be locked from changes until my change is complete. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let's
say I am processing a change to the tax rate in my state; my store clerk should not be able
to read the total cost of a blue shirt, because the total cost row is affected by any change in
the tax rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but has not been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed
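The resource-ordering strategy from the deadlock answer above can be sketched with two Python threads that always acquire their locks in one fixed global order, so the circular wait a deadlock requires can never form (thread and lock names are invented for illustration):

```python
import threading

# Two resources; every "transaction" locks them in a fixed global
# order (here, by object id), regardless of the order it names them.
lock_a, lock_b = threading.Lock(), threading.Lock()

def transfer(first, second, log, name):
    # Sort the locks so all threads acquire in the same order;
    # this removes the circular-wait condition for deadlock.
    ordered = sorted((first, second), key=id)
    for lk in ordered:
        lk.acquire()
    log.append(name)          # critical section
    for lk in reversed(ordered):
        lk.release()

log = []
# T1 and T2 name the locks in opposite orders -- the classic
# deadlock setup -- but the ordering discipline makes it safe.
t1 = threading.Thread(target=transfer, args=(lock_a, lock_b, log, "T1"))
t2 = threading.Thread(target=transfer, args=(lock_b, lock_a, log, "T2"))
t1.start(); t2.start()
t1.join(); t2.join()
```

Without the `sorted(...)` step, T1 could hold lock_a while T2 holds lock_b, each waiting forever for the other, which is exactly the cycle a DBMS deadlock detector looks for.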
(b) Explain concurrency control and database recovery in detail
Ans In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
Lock based protocols
Time stamp based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
Binary Locks: A lock on a data item can be in two states; it is either locked or
unlocked
Shared/exclusive: This type of locking mechanism differentiates the locks based on
their uses. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock; allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state. Read locks are shared because no data
value is being changed
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first
part, when the transaction starts executing, it seeks permission for the locks it requires. The
second part is where the transaction acquires all the locks. As soon as the transaction releases its
first lock, the third phase starts; in this phase the transaction cannot demand any new locks, it
only releases the acquired locks
Two-phase locking thus has two phases: a growing phase, where all the locks are being acquired by
the transaction, and a shrinking phase, where the locks held by the transaction are
being released
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the
transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a
lock after using it: it holds all the locks until the commit point and releases them all at once
Unlike 2PL, Strict-2PL does not suffer from cascading aborts
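The 2PL rule described above can be sketched in a few lines: once the first lock is released the shrinking phase begins, and any further lock request must be rejected. A minimal illustrative class (not a real DBMS API):

```python
class TwoPhaseTxn:
    """Minimal two-phase locking sketch: after the first unlock
    (start of the shrinking phase), no new lock may be acquired."""
    def __init__(self):
        self.held = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violated: lock after unlock")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True          # shrinking phase begins
        self.held.discard(item)

t = TwoPhaseTxn()
t.lock("x"); t.lock("y")   # growing phase
t.unlock("x")              # first release: shrinking phase
violated = False
try:
    t.lock("z")            # illegal under 2PL
except RuntimeError:
    violated = True
```

Strict-2PL would go further and forbid `unlock` itself until commit, which is what removes cascading aborts.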
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either the system time or a logical counter as a timestamp
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 0002 would be older than all other
transactions that come after it; for example, any transaction y entering the system at 0004 is
two seconds younger, and priority is given to the older one
In addition, every data item is given the latest read and write timestamp. This lets the system
know when the last read and write operation was performed on the data item
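Those per-item read and write timestamps are what the protocol consults: in the basic timestamp-ordering test, a transaction's write is rejected (and the transaction rolled back) if a younger transaction has already read or written the item. A simplified sketch (class and field names are illustrative):

```python
class Item:
    """A data item carrying the latest read and write timestamps."""
    def __init__(self):
        self.read_ts = 0
        self.write_ts = 0

def write(item, ts):
    """Basic timestamp-ordering write check: an older transaction
    may not overwrite what a younger one has already seen or written."""
    if ts < item.read_ts or ts < item.write_ts:
        return False           # too late: transaction must roll back
    item.write_ts = ts
    return True

x = Item()
ok_first = write(x, 5)         # transaction T5 writes x first
x.read_ts = 7                  # a younger T7 then reads x
ok_late = write(x, 6)          # older T6 arrives late: rejected
```

The symmetric check for reads (reject a read if a younger transaction has already written the item) completes the protocol.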
OR
(b) Explain database security mechanisms8
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d)Explain knowledge based database system in detail
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data Data was usually represented in a tabular format with strings or numbers in each
field
Multiple users A conventional database needed to support more than one user or system
logged into the same data at the same time
Transactions An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users These are the so-
called ACID properties Atomicity Consistency Isolation and Durability
Large long-lived data A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data Such a database usually needed to persist past
the specific uses of any individual program it needed to store data for years and decades
rather than for the life of a program
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes, subclasses and instances
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database The knowledge-base needed to know facts about the world For example to represent
the statement that All humans are mortal A database typically could not represent this general
knowledge but instead would need to store information about thousands of tables that
represented information about specific humans Representing that all humans are mortal and
being able to reason about any given human that they are mortal is the work of a knowledge-
base Representing that George Mary Sam Jenna Mike and hundreds of thousands of other
customers are all humans with specific ages sex address etc is the work for a database[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well On the other hand the large database vendors such as Oracleadded
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined As the technology scaled up it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning and knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by us humans
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data; there is no way they can interfere with one another. Any practical
database, though, has a mix of READ and WRITE operations, and hence concurrency is a challenge
Concurrency control is used to address such conflicts, which mostly occur in a multi-
user system. It helps you make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases
Therefore, concurrency control is a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur, or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress)
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails
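The bank-transfer example can be demonstrated with SQLite, whose connection object, used as a context manager, wraps a block of statements in a single transaction that commits on success and rolls back on any error (the table, column names and amounts are invented for illustration):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account(name TEXT PRIMARY KEY, "
           "balance INT CHECK (balance >= 0))")
db.executemany("INSERT INTO account VALUES (?, ?)",
               [("A", 100), ("B", 50)])
db.commit()

# Attempt to transfer 200 from A to B. The credit succeeds, but
# the debit violates the CHECK constraint, so the whole
# transaction rolls back: neither update survives.
try:
    with db:   # one transaction: commit on success, rollback on error
        db.execute("UPDATE account SET balance = balance + 200 "
                   "WHERE name = 'B'")
        db.execute("UPDATE account SET balance = balance - 200 "
                   "WHERE name = 'A'")   # fails: balance would go negative
except sqlite3.IntegrityError:
    pass   # transfer aborted atomically

a = db.execute("SELECT balance FROM account "
               "WHERE name = 'A'").fetchone()[0]
total = db.execute("SELECT SUM(balance) FROM account").fetchone()[0]
```

After the failed transfer both balances are untouched: money has been neither lost nor created, exactly the guarantee described above.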
(B) Give the level architecture proposal for DBMS
Ans Objective of three level architecture proposal for DBMS
All users should be able to access the same data
A user's view is immune to changes made in other views
Users should not need to know physical database storage details
The DBA should be able to change database storage structures without affecting the users' views
The internal structure of the database should be unaffected by changes to physical aspects of storage
The DBA should be able to change the conceptual structure of the database without affecting all users
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
The three levels are explained in detail below
External Level
This is the highest level one that is closest to the user It is also called the user view The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language such as COBOL or C or a fourth-generation
language specific to the DBMS such as visual FoxPro or MS Access
The end user uses a query language to access data from the database A query language is a
combination of three subordinate language
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level the data is viewed without any of these constraints
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus the objectives of the three-level architecture proposal for a DBMS are explained
above
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2 DML Compiler and Query optimizer - The DML commands such as insert update delete
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access The object code is then optimized in the best way to execute a query by
the query optimizer and then send to the data manager
3 Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System
The Main Functions Of Data Manager Are ndash
Converting operations in users' queries, coming from the application programs or from the
combination of the DML Compiler and Query Optimizer (known as the Query Processor), from the user's
logical view to the physical file system
Controls DBMS information access that is stored on disk
It also controls handling buffers in main memory
It also enforces constraints to maintain consistency and integrity of the data
It also synchronizes the simultaneous operations performed by the concurrent users
It also controls the backup and recovery operations
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about
1 Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table
2 Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed
3 Constraints on data, ie the range of values permitted
4 Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes
5 Access authorization - the description of database users, their responsibilities
and their access rights
6 Usage statistics, such as frequency of queries and transactions
The data dictionary is used to actually control the data integrity, database operation
and accuracy. It may be used as an important part of the DBMS
Importance of Data Dictionary -
The data dictionary is necessary in databases due to the following reasons
• It improves the control of the DBA over the information system and the users'
understanding of the use of the system
• It helps in documenting the database design process by storing documentation of the result of every design phase and the design decisions
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts high-level queries into low-level file access
commands, known as compiled DML
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve User Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of response is always indicated. Thus a very competent database designer could be allowed to use a particular database system only as a naive user
ii) Online users There are users who may communicate with the database directly via an online terminal or indirectly via a user interface and application program These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program The more sophisticated of these users may also use a data manipulation language to manipulate the database directly On-line users can also be naive users requiring help such as menus
iii) Application Users Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category The application programs could be written in a general purpose programming language such as Assembler C COBOL FORTRAN PASCAL or PLI and include the commands required to manipulate the database
iv) Database Administrator Centralized control of the database is exerted by a person or group of persons under the supervision of a high level administrator This person or group is referred to as the database administrator (DBA) They are users who are the most familiar with the database and are responsible for creating modifying and maintaining its three levels
The DBA us the custodian of the data and controls the database structure The DBA administers the three levels of the database and in consultation with the overall user community sets up the definition of the global view or conceptual level of the database The DBA further specifies the external view of the various users and applications and is responsible for definition and implementation of the internal level including the storage structure and access methods to be used for the optimum performance of the DBMS
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It implies separation of physical storage from the use of the data by an application program, i.e. program/data independence: the user, programmer or application specialist need not know the details of how the data are stored, since such details are transparent to the user. Changes can be made to the data without affecting other components of the system, e.g. changing the format of data items (real to integer arithmetic), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy: In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• duplication of the same data in different files;
• wastage of storage space, since duplicated data is stored;
• errors generated due to updating of the same data in different files;
• time wasted in entering the same data again and again;
• needless use of computer resources;
• difficulty in combining information.
2. Elimination of Inconsistency: In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data. We therefore need to remove this duplication of data across multiple files to eliminate inconsistency.
3. Better service to the users: A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, use of a DBMS should allow users that don't know programming to interact with the data more easily, unlike a file processing system, where the programmer may need to write new programs to meet every new demand.
4. Flexibility of the system is improved: Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5. Integrity can be improved: Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints. In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced: Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security can be improved: In conventional systems, applications are developed in an ad hoc, temporary manner. Often different systems of an organization access different components of the operational data, and in such an environment enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8. Organization's requirements can be identified: All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, as the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units, so it may become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization. It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower: It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for similar services using conventional systems, since the productivity of programmers can be higher using the non-procedural languages that have been developed with DBMSs than using procedural languages.
10. A data model must be developed: Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems it is more likely that files will be designed as the needs of particular applications demand, and the overall view is often not considered. Building an overall view of an organization's data is usually cost-effective in the long term.
11. Provides backup and recovery: Centralizing a database provides schemes for backup and recovery from failures, including disk crashes, power failures and software errors, which help the database recover from an inconsistent state to the state that existed prior to the failure, though the methods involved are very complex.
QUE2- EITHER
(A) Explain the ER model with a suitable example.
Ans: The ER model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Modelling is an iterative, team-oriented process in which all business managers (or their designates) should be involved, and the result should be validated with a "bottom-up" approach. The model has three primary components: entities, relationships and attributes. Many notation methods exist; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name, address, etc. Attributes are of various types:
• Simple/single attributes
• Composite attributes
• Multivalued attributes
• Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds, connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example:
Given: entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), where street is itself composite (street_name, street_number, apartment_number).
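As an illustration only (Python, with invented sample values), the Customer entity above, with its simple, composite, multivalued and derived attributes, might be modelled as:

```python
from dataclasses import dataclass, field
from datetime import date

# Composite attribute: name is made of simple sub-attributes.
@dataclass
class Name:
    first_name: str
    last_name: str
    middle_name: str = ""

# Composite attribute nested inside another composite (address -> street).
@dataclass
class Street:
    street_name: str
    street_number: str
    apartment_number: str = ""

@dataclass
class Address:
    city: str
    state: str
    zip_code: str
    street: Street

# Entity: Customer, with customer_id as the primary key.
@dataclass
class Customer:
    customer_id: int                                    # primary key (simple attribute)
    name: Name                                          # composite attribute
    phone_numbers: list = field(default_factory=list)   # multivalued attribute
    date_of_birth: date = None
    address: Address = None

    @property
    def age(self):
        # Derived attribute: computed from date_of_birth, never stored.
        today = date.today()
        born = self.date_of_birth
        return today.year - born.year - ((today.month, today.day) < (born.month, born.day))
```

The mapping is direct: each ellipse in the ER diagram becomes a field, composite attributes become nested types, the multivalued attribute becomes a list, and the derived attribute becomes a computed property.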
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file and direct file organizations, we have considered the retrieval and update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
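A small sketch of this idea using Python's built-in sqlite3 module (table and names invented for the example): a secondary index on stud_name lets us retrieve the whole set of matching records, while primary-key retrieval yields at most one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stud_id INTEGER PRIMARY KEY, stud_name TEXT, city TEXT)")
# Secondary index on the non-primary-key attribute stud_name.
conn.execute("CREATE INDEX idx_stud_name ON student(stud_name)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Asha", "Pune"), (2, "Ravi", "Nagpur"), (3, "Asha", "Mumbai")])

# Primary-key retrieval: at most one record satisfies the key value.
one = conn.execute("SELECT * FROM student WHERE stud_id = 2").fetchall()

# Secondary-key retrieval: a set of records may satisfy the key value.
many = conn.execute("SELECT * FROM student WHERE stud_name = 'Asha'").fetchall()
```

Here the two rows named "Asha" are both returned by the secondary-key search, exactly the multiple-record case described in point (ii).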
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation

QUE3 - EITHER
(A) Let R = (A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∪ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows: if a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot have a lossless decomposition into any number of smaller tables. Another way of expressing this is that each join dependency is a consequence of the candidate keys. It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes. Anomalies can occur in relations in 4NF if the primary key has three or more fields. 5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
• you always need to know two values (pairwise);
• for any one value you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy. Take the following sample data:

buyer   vendor          item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
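The decomposition can be checked on the sample data itself. A Python sketch showing that the three-way natural join of the pairwise projections reconstructs the original Buying table, i.e. that the join dependency holds for this data:

```python
# The single Buying(buyer, vendor, item) table from the example.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach",       "Jeans"),
    ("Mary",  "Jordach",       "Jeans"),
    ("Sally", "Jordach",       "Sneakers"),
}

# Project onto the three pairwise tables of the 5NF decomposition.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

def join3(bv, bi, vi):
    """Natural join of the three binary projections back into triples."""
    return {(b, v, i)
            for b, v in bv
            for b2, i in bi if b2 == b
            for v2, i2 in vi if v2 == v and i2 == i}

rejoined = join3(buyer_vendor, buyer_item, vendor_item)
```

The rejoined set equals the original relation, so no information is lost. If Claiborne starts to sell jeans, the new fact is recorded by adding the single pair ("Liz Claiborne", "Jeans") to vendor_item, rather than one row per buyer in the unfactored table.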
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support both batch and online application programs.
(Figure: two applications, A and B, each written in a host language plus DL/I; each application has its own PSB (PSB-A, PSB-B) containing PCBs; the PCBs communicate through the IMS control program with the DBDs that define the physical databases.)
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD), and the mapping of the physical database to storage is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE, BYTES=256
3  FIELD NAME=(COURSE,SEQ), BYTES=3, START=1
4  FIELD NAME=TITLE, BYTES=33, START=4
5  FIELD NAME=DESCRIPN, BYTES=220, START=37
6  SEGM  NAME=PREREQ, PARENT=COURSE, BYTES=36
7  FIELD NAME=(COURSE,SEQ), BYTES=3, START=1
8  FIELD NAME=TITLE, BYTES=33, START=4
9  SEGM  NAME=OFFERING, PARENT=COURSE, BYTES=20
10 FIELD NAME=(DATE,SEQ,M), BYTES=6, START=1
11 FIELD NAME=LOCATION, BYTES=12, START=7
12 FIELD NAME=FORMAT, BYTES=2, START=19
13 SEGM  NAME=TEACHER, PARENT=OFFERING, BYTES=24
14 FIELD NAME=(EMP,SEQ), BYTES=6, START=1
15 FIELD NAME=NAME, BYTES=18, START=7
16 SEGM  NAME=STUDENT, PARENT=OFFERING, BYTES=25
17 FIELD NAME=(EMP,SEQ), BYTES=6, START=1
18 FIELD NAME=NAME, BYTES=18, START=7
19 FIELD NAME=GRADE, BYTES=1, START=25
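The segment hierarchy this DBD defines can be sketched as a nested structure (a Python illustration only, not IMS syntax): COURSE is the root segment, PREREQ and OFFERING are its children, and TEACHER and STUDENT are children of OFFERING.

```python
# Nested dict mirroring the SEGM/PARENT clauses of the DBD above.
hierarchy = {
    "COURSE": {
        "PREREQ": {},
        "OFFERING": {
            "TEACHER": {},
            "STUDENT": {},
        },
    },
}

def parent_of(tree, target, parent=None):
    """Return the parent segment name of `target`, or None for the root."""
    for name, children in tree.items():
        if name == target:
            return parent
        found = parent_of(children, target, name)
        if found is not None:
            return found
    return None
```

Walking the structure reproduces the PARENT= relationships: STUDENT's parent is OFFERING, and OFFERING's parent is COURSE, which is the root.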
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called the program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example
1 PCB    TYPE=DB, DBNAME=EDUCPDBD, KEYLEN=15
2 SENSEG NAME=COURSE, PROCOPT=G
3 SENSEG NAME=OFFERING, PARENT=COURSE, PROCOPT=G
4 SENSEG NAME=STUDENT, PARENT=OFFERING, PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are supported via user-written online application programs. IMS does not provide an integrated query language.
OR
(C) Explain the following:
(i) Functional dependency
A functional dependency means that the value of one attribute (the determinant) determines the value of another attribute.
Candidate key: a possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
Main characteristics of the functional dependencies used in normalization:
• they have a 1:1 relationship between the attribute(s) on the left-hand and right-hand sides of the dependency;
• they hold for all time;
• they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
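As an illustration of the definition above, a small checker can test whether a dependency holds in a sample relation (the relation and attribute names here are invented for the example):

```python
def holds(relation, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in `relation`.

    relation: list of dicts (rows); lhs, rhs: tuples of attribute names.
    The determinant (lhs) determines rhs iff no two rows agree on lhs
    but disagree on rhs.
    """
    seen = {}
    for row in relation:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False   # same determinant value, different dependent value
        seen[key] = val
    return True

# Sample relation: each department has exactly one head.
rows = [
    {"emp_id": 1, "dept": "Sales", "dept_head": "Iyer"},
    {"emp_id": 2, "dept": "Sales", "dept_head": "Iyer"},
    {"emp_id": 3, "dept": "HR",    "dept_head": "Rao"},
]
```

Note the check only verifies a dependency against one instance; a true functional dependency must hold for all time, as the characteristics above state.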
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal form condition that the relation meets, and indicates the degree to which it has been normalized. Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties. Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF. The database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
• NF2: non-first normal form.
• 1NF: R is in 1NF iff all domain values are atomic.
• 2NF: R is in 2NF iff R is in 1NF and every non-key attribute is fully dependent on the key.
• 3NF: R is in 3NF iff R is in 2NF and every non-key attribute is non-transitively dependent on the key.
• BCNF: R is in BCNF iff every determinant is a candidate key.
• Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and its multivalued dependencies are functional dependencies. 4NF removes unwanted data structures: multivalued dependencies. One of the following conditions must hold for a relation to be in fourth normal form:
• there is no multivalued dependency in the relation; or
• there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers multivalued dependencies.
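A hedged sketch of the 4NF idea, with invented sample data: an employee's skills and languages are independent multivalued facts (emp ->> skill, emp ->> language), so storing them in one relation forces a cross product of rows; the 4NF decomposition splits them and rejoins losslessly.

```python
# A relation violating 4NF: emp ->> skill and emp ->> language are
# independent, so every (skill, language) combination must be stored.
emp_skill_lang = {
    ("Asha", "DBA",   "English"),
    ("Asha", "DBA",   "Hindi"),
    ("Asha", "Coder", "English"),
    ("Asha", "Coder", "Hindi"),
}

# 4NF decomposition: one relation per independent multivalued fact.
emp_skill = {(e, s) for e, s, l in emp_skill_lang}
emp_lang  = {(e, l) for e, s, l in emp_skill_lang}

# The natural join of the two projections recovers the original,
# so the decomposition is lossless.
rejoined = {(e, s, l)
            for e, s in emp_skill
            for e2, l in emp_lang if e2 == e}
```

After the decomposition, adding a new language for Asha means inserting one row into emp_lang instead of one row per skill, which is exactly the update anomaly 4NF removes.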
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found by a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could retrieve a user's account information and efficiently provide extensive details such as transactions and account information entries.
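The banking example can be sketched in plain Python (class and attribute names invented): navigational access follows object references directly, instead of joining CUSTOMER, ACCOUNT and TRANSACTION tables on foreign keys.

```python
class Transaction:
    def __init__(self, amount):
        self.amount = amount

class Account:
    def __init__(self):
        # Direct object references: no foreign-key join is needed later.
        self.transactions = []

class Customer:
    def __init__(self, name):
        self.name = name
        self.accounts = []

cust = Customer("R. Sharma")
acct = Account()
acct.transactions = [Transaction(500), Transaction(-120)]
cust.accounts.append(acct)

# Navigational access: follow pointers from customer to transactions.
total = sum(t.amount for a in cust.accounts for t in a.transactions)
```

This is the pointer-following retrieval described above; a relational system would reach the same data through key-based joins.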
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs. System breakdowns happen all the time, even to the best-configured systems; this is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery is more easily achieved if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls the following:
• the speed and size of your transaction log backups;
• the degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery model available:
• Full Recovery
• Bulk-Logged Recovery
• Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
• Database restoration up to any specified time can be achieved after media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
• The Log Marks feature allows you to place reference points in the transaction log that let you recover to a log mark.
• It logs CREATE INDEX operations. Recovery from a transaction log backup that includes index creations is faster, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT and UPDATETEXT.
Simple Recovery Model
It allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
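The data-loss window that the recovery model controls can be sketched with Python's built-in sqlite3 module. This is only an analogy (SQLite is a different engine with no recovery-model setting): work committed after a full backup is lost if you must restore from that backup alone, which is the gap that log backups in the Full Recovery model close.

```python
import sqlite3

# Live database with one committed row.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
src.execute("INSERT INTO t VALUES (1, 'committed before backup')")
src.commit()

# Take a full backup of the live database.
backup = sqlite3.connect(":memory:")
src.backup(backup)

# Work committed after the backup is not in the backup copy:
# this is the data-loss window a recovery model is meant to control.
src.execute("INSERT INTO t VALUES (2, 'committed after backup')")
src.commit()

rows_in_backup = backup.execute("SELECT COUNT(*) FROM t").fetchone()[0]
rows_in_live = src.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

Restoring from the backup recovers only the first row; recovering the second requires the transaction log, which is exactly what the Full Recovery model preserves.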
(d) Describe deadlocks in a distributed system.
Ans:
(ii) Degree and cardinality
The degree of a relationship (also known as cardinality) is the number of occurrences in one entity which are associated (or linked) with the number of occurrences in another. There are three degrees of relationship, known as:
1. one-to-one (1:1)
2. one-to-many (1:M)
3. many-to-many (M:N)
(Note that the last is conventionally written M:N, not M:M.)
One-to-one (1:1)
This is where one occurrence of an entity relates to only one occurrence in another entity. A one-to-one relationship rarely exists in practice, but it can; if one does, you may consider combining the two entities into one. For example, an employee is allocated a company car which can only be driven by that employee. Therefore there is a one-to-one relationship between employee and company car.
One-to-Many (1:M)
This is where one occurrence in an entity relates to many occurrences in another entity. For example, taking the employee and department entities shown on the previous page, an employee works in one department, but a department has many employees. Therefore there is a one-to-many relationship between department and employee.
Many-to-Many (M:N)
This is where many occurrences in an entity relate to many occurrences in another entity. The normalisation process discussed earlier would prevent any such relationships, but the definition is included here for completeness. As with one-to-one relationships, many-to-many relationships rarely exist; normally they occur because an entity has been missed. For example, an employee may work on several projects at the same time, and a project has a team of many employees. Therefore there is a many-to-many relationship between employee and project.
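The three degrees can be checked mechanically from sample occurrence pairs. A small hypothetical helper (names and data invented) that classifies a left:right relationship by the maximum fan-out in each direction:

```python
from collections import defaultdict

def cardinality(pairs):
    """Classify (left, right) occurrence pairs as '1:1', '1:M', 'M:1' or 'M:N'."""
    fan_left = defaultdict(set)    # left occurrence  -> linked right occurrences
    fan_right = defaultdict(set)   # right occurrence -> linked left occurrences
    for a, b in pairs:
        fan_left[a].add(b)
        fan_right[b].add(a)
    left_max = max(len(s) for s in fan_left.values())
    right_max = max(len(s) for s in fan_right.values())
    if left_max == 1 and right_max == 1:
        return "1:1"
    if left_max == 1:
        return "M:1"   # each left links to one right; a right links to many lefts
    if right_max == 1:
        return "1:M"
    return "M:N"

# (employee, department): each employee in one department, Sales has two employees.
emp_dept = [("e1", "Sales"), ("e2", "Sales"), ("e3", "HR")]
```

For the employee-department data this reports M:1 (and dept:employee is 1:M), matching the text; the employee-project case comes out M:N.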
Q4
EITHER
(a) Explain DBTG data manipulation.
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on Data Systems Languages (CODASYL), the group responsible for standardization of the programming language COBOL. The DBTG final report appeared in April 1971; it introduced a new, distinct and self-contained language. The DBTG proposal is intended to meet the requirements of many distinct programming languages, not just COBOL: the user in a DBTG system is considered to be an ordinary application programmer, and the language therefore is not biased toward any single specific programming language.
The DBTG proposal is based on the network model. In addition to proposing a formal notation for networks (the Data Definition Language, or DDL), the DBTG proposed a Subschema Data Definition Language (Subschema DDL) for defining views of a conceptual scheme that was itself defined using the Data Definition Language. It also proposed a Data Manipulation Language (DML) suitable for writing application programs that manipulate the conceptual scheme or a view.
Architecture of the DBTG Model
The architecture of a DBTG system is illustrated in the figure. It can be divided into three different levels, as in the architecture of a database system:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema, written in a Data Storage Description Language (DSDL).
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists essentially of definitions of the various types of record in the database, the data items they contain, and the sets into which they are grouped. (Here logical record types are referred to as record types; the fields in a logical record format are called data items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists essentially of a specification of which schema record types the user is interested in, which schema data items he or she wishes to see in those records, and which schema relationships (sets) linking those records he or she wishes to consider. By default, all other types of record, data item and set are excluded.
In the DBTG model the users are application programmers writing in an ordinary programming language, such as COBOL, that has been extended to include the DBTG data manipulation language. Each application program invokes the corresponding subschema using the COBOL Data Base Facility; for example, the programmer simply specifies the name of the required subschema in the Data Division of the program. This invocation provides the definition of the user work area (UWA) for that program. The UWA contains a distinct location for each type of record (and hence for each data item) defined in the subschema. The program may refer to these data item and record locations by the names defined in the subschema.
Q5
EITHER
(a) Define normalization. Explain first and second normal form.
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties. Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF.
• NF2: non-first normal form.
• 1NF: R is in 1NF iff all domain values are atomic.
• 2NF: R is in 2NF iff R is in 1NF and every non-key attribute is fully dependent on the key.
• 3NF: R is in 3NF iff R is in 2NF and every non-key attribute is non-transitively dependent on the key.
Unnormalized Form (UNF)
A table that contains one or more repeating groups. To create an unnormalized table, transform data from the information source (e.g. a form) into table format, with columns and rows.
First Normal Form (1NF)
A relation in which intersection of each row and column contains one and only one value
If a table of data meets the definition of a relation it is in first normal form
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table
Identify repeating group(s) in unnormalized table which repeats for the key attribute(s)
Remove the repeating group either by:
• entering appropriate data into the empty columns of rows containing repeating data ('flattening' the table); or by
• placing the repeating data, along with a copy of the original key attribute(s), into a separate relation.
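A minimal sketch of the 'flattening' option, with an invented order table: the key is repeated for each member of the repeating group, so every row holds only atomic values and the relation is in 1NF.

```python
# Unnormalized: one row per order, with a repeating group of (item, qty) pairs.
unf = [
    {"order_id": 101, "items": [("pen", 2), ("pad", 1)]},
    {"order_id": 102, "items": [("pen", 5)]},
]

# 'Flattening': repeat the key attribute for each member of the
# repeating group, giving one atomic-valued row per item (1NF).
first_nf = [
    {"order_id": o["order_id"], "item": name, "qty": qty}
    for o in unf
    for name, qty in o["items"]
]
```

Each intersection of a row and column now contains one and only one value, which is the 1NF definition given above.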
Second Normal Form (2NF)
Based on the concept of full functional dependency: where A and B are attributes of a relation, B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A. 2NF: a relation that is in 1NF and in which every non-primary-key attribute is fully functionally dependent on the primary key.
Equivalently: 1NF and no partial functional dependencies. A partial functional dependency exists when one or more non-key attributes are functionally dependent on part of the primary key; every non-key attribute must be defined by the entire key, not just by part of the key. If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies on the primary key exist, remove them by placing them in a new relation along with a copy of their determinant.
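The 1NF-to-2NF steps above can be sketched concretely. The relation, attribute names and data below are hypothetical, invented for illustration: an order-line relation with key (order_id, product_id), where product_name depends only on product_id (a partial dependency), so it is moved into a new relation together with a copy of its determinant.

```python
# Sketch (not from the original text): removing a partial dependency to reach 2NF.
# Hypothetical 1NF relation with key (order_id, product_id); product_name
# depends only on product_id, so it is only partially dependent on the key.
rows = [
    {"order_id": 1, "product_id": 10, "qty": 2, "product_name": "Pen"},
    {"order_id": 1, "product_id": 20, "qty": 1, "product_name": "Pad"},
    {"order_id": 2, "product_id": 10, "qty": 5, "product_name": "Pen"},
]

# Place the partially dependent attribute in a new relation with a copy
# of its determinant (product_id), and drop it from the original relation.
order_line = [{k: r[k] for k in ("order_id", "product_id", "qty")} for r in rows]
product = {r["product_id"]: r["product_name"] for r in rows}

print(product)  # {10: 'Pen', 20: 'Pad'} -- each product name is now stored once
```

The duplication of "Pen" in the unnormalized rows disappears: the name is recorded once per product, which is exactly the redundancy 2NF removes.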
Third Normal Form (3NF)
2NF and no transitive dependencies.
Transitive dependency: a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency:
A, B and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).
3NF: a relation that is in 1NF and 2NF, and in which no non-primary-key attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with suitable example.
Ans:
As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form.
2. 1NF: R is in 1NF iff all domain values are atomic.
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
5. BCNF: R is in BCNF iff every determinant is a candidate key.
6. Determinant: an attribute (or set of attributes) on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF removes an unwanted kind of data structure: multivalued dependencies.
Example: in a relation Course(course, teacher, book), where a course has several teachers and several recommended books chosen independently of each other, the multivalued dependencies course ↠ teacher and course ↠ book hold, and every teacher of a course must be paired with every book of that course.
For a relation to be in fourth normal form, one of these conditions must hold:
There is no multivalued dependency in the relation, or
every multivalued dependency present is implied by the candidate keys.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it also takes multivalued dependencies into account.
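A common illustration (not taken from the text above; the course/teacher/book relation is a standard hypothetical example): when course ↠ teacher and course ↠ book hold, projecting onto (course, teacher) and (course, book) gives a 4NF decomposition whose natural join losslessly recovers the original relation.

```python
# Sketch (illustrative): a relation with the multivalued dependencies
# course ->> teacher and course ->> book. Every teacher of a course is
# paired with every book of that course, so the rows multiply.
r = {
    ("DBMS", "Smith", "Korth"),
    ("DBMS", "Smith", "Ullman"),
    ("DBMS", "Jones", "Korth"),
    ("DBMS", "Jones", "Ullman"),
}

# 4NF decomposition: project onto (course, teacher) and (course, book).
ct = {(c, t) for c, t, b in r}
cb = {(c, b) for c, t, b in r}

# The natural join on course recovers the original relation exactly (lossless).
joined = {(c, t, b) for c, t in ct for c2, b in cb if c == c2}
print(joined == r)  # True
```

The two projections hold two rows each instead of four, and adding a new book no longer requires one new row per teacher.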
(d) What are inference axioms? Explain their significance in Relational Database Design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy certain other FDs:
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity of (2) and (4)
[From Maier]
2. Let R = (A B C D E G H I J), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F:
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
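Derivations like the one above can also be checked mechanically by computing the attribute closure of AB under F, a standard algorithm. The sketch below represents each FD as a (left side, right side) pair of attribute sets; this encoding is an implementation choice, not something from the source.

```python
# Sketch: compute the closure X+ of an attribute set under a set of FDs,
# then check AB -> GH from the Maier example mechanically.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already in the closure,
            # every right-side attribute follows.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

F = [(frozenset("AB"), frozenset("E")),
     (frozenset("AG"), frozenset("J")),
     (frozenset("BE"), frozenset("I")),
     (frozenset("E"),  frozenset("G")),
     (frozenset("GI"), frozenset("H"))]

print(sorted(closure({"A", "B"}, F)))  # AB+ contains G and H, so AB -> GH holds
```

AB → GH holds exactly when {G, H} ⊆ AB⁺, which is how FD membership is tested in practice instead of searching for an axiom-by-axiom proof.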
Significance in Relational Database design: A relational database is a database structure, commonly used in GIS, in which data is stored in two-dimensional tables and multiple relationships between data elements can be defined and established in an ad-hoc manner. A Relational Database Management System is a database system made up of files with data elements in two-dimensional arrays (rows and columns); it has the capability to recombine data elements to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables:
• The tables are manipulated a set at a time, rather than a record at a time.
• SQL is used to manipulate relational databases. The model was proposed by Dr. Codd in 1970.
• It is the basis for the relational database management system (RDBMS).
• The relational model contains the following components:
• A collection of objects or relations
• A set of operations to act on the relations
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions each require access to data that is locked by the other. It can be dealt with in two ways: one is to set measures which prevent deadlocks from happening, and the other is to set ways in which to break a deadlock after it happens. One way to prevent or avoid deadlocks is to require the user to request all necessary locks at one time, ensuring they gain access to everything they need or to nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order, meaning resources must be locked in a certain order to prevent such instances. Once a deadlock does occur, the DBMS must have a method for detecting it; to resolve it, the DBMS must select a transaction to cancel and revert that entire transaction until the required resources become available, allowing one transaction to complete while the other has to be reprocessed at a later time.
Explain the meaning of the expression ACID transaction.
Ans: ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens it should be atomic: it should either complete fully or not happen at all; there should not be anything like a semi-complete transaction. The database state should remain consistent after the completion of the transaction. If there is more than one transaction, the transactions should be scheduled in such a fashion that they remain in isolation from one another. Durability means that once a transaction commits, its effects will persist even if there are system failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the process of being changed. Their purpose is to ensure consistency throughout the database. For example, if I am changing a row which affects the calculations or outputs of several other rows, then all rows that are affected, or possibly affected, by a change in the row I am working on will be locked from changes until my change is complete. This isolates the change and ensures that the data interaction remains accurate and consistent, and is known as transaction-level consistency. The transaction being changed, which may affect several other pieces of data or rows of input, could also affect how those rows are read. So let's say I am processing a change to the tax rate in my state; my store clerk shouldn't be able to read the total cost of a blue shirt, because the total cost row is affected by any change in the tax rate row. Essentially, how you deal with the reading and viewing of data while a change is being processed but hasn't been committed is the transaction isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction being committed.
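The resource-ordering prevention strategy mentioned in the deadlock answer above can be sketched as follows. This is an illustrative sketch, not a DBMS internal: the lock names and transactions are invented, and sorting lock names simply fixes one global acquisition order so a circular wait cannot form.

```python
# Sketch of deadlock prevention by resource ordering: if every transaction
# acquires its locks in one fixed global order, a circular wait cannot form.
import threading

locks = {"accounts": threading.Lock(), "orders": threading.Lock()}

def acquire_in_order(names):
    # Sort by name so all transactions lock resources in the same order,
    # regardless of the order in which they asked for them.
    ordered = sorted(names)
    for n in ordered:
        locks[n].acquire()
    return ordered

def release(ordered):
    for n in reversed(ordered):
        locks[n].release()

# Two transactions may request the same resources in different orders,
# but both end up locking "accounts" before "orders".
held = acquire_in_order(["orders", "accounts"])
print(held)  # ['accounts', 'orders']
release(held)
```

With this discipline, a transaction holding "accounts" and waiting for "orders" can never meet one holding "orders" and waiting for "accounts", which is exactly the cycle a deadlock requires.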
(b) Explain concurrency control and database recovery in detail.
Ans: In a multiprogramming environment, where multiple transactions can be executed simultaneously, it is highly important to control the concurrency of transactions. We have concurrency control protocols to ensure atomicity, isolation and serializability of concurrent transactions. Concurrency control protocols can be broadly divided into two categories:
Lock based protocols
Time stamp based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two kinds:
Binary Locks: A lock on a data item can be in two states; it is either locked or unlocked.
Shared/exclusive: This type of locking mechanism differentiates the locks based on their use. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock; allowing more than one transaction to write the same data item would lead the database into an inconsistent state. Read locks are shared because no data value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a write operation is performed. Transactions may unlock the data item after completing the 'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they need locks. Before initiating an execution, the transaction requests the system for all the locks it needs beforehand. If all the locks are granted, the transaction executes and releases all the locks when all its operations are over. If all the locks are not granted, the transaction rolls back and waits until all the locks are granted.
Two-Phase Locking (2PL)
This locking protocol divides the execution of a transaction into three parts. In the first part, when the transaction starts executing, it seeks permission for the locks it requires. The second part is where the transaction acquires all the locks. As soon as the transaction releases its first lock, the third phase starts; in this phase, the transaction cannot demand any new locks, it only releases the acquired locks.
Two-phase locking thus has two phases: a growing phase, where all the locks are being acquired by the transaction, and a shrinking phase, where the locks held by the transaction are being released.
To claim an exclusive (write) lock, a transaction may first acquire a shared (read) lock and then upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as 2PL. After acquiring all the locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock after using it: Strict-2PL holds all the locks until the commit point and releases them all at one time.
Strict-2PL does not suffer from cascading aborts as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between conflicting pairs of transactions at the time of execution, whereas timestamp-based protocols start working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age of the transaction. A transaction created at clock time 00:02 would be older than all transactions that come after it; for example, a transaction y entering the system at 00:04 is two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read-timestamp and write-timestamp. This lets the system know when the last 'read' and 'write' operations were performed on the data item.
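The basic timestamp-ordering rules described above can be sketched as follows. This is a simplified, illustrative model (the Item class and rejection-by-return-value are assumptions, not a real DBMS API): each data item keeps the latest read- and write-timestamps, and an operation from a transaction that is "too old" is rejected, which in a real system would trigger a rollback and restart.

```python
# Sketch of basic timestamp-ordering rules (illustrative, not a DBMS API).
class Item:
    def __init__(self):
        self.read_ts = 0   # timestamp of the youngest reader so far
        self.write_ts = 0  # timestamp of the youngest writer so far

def read(item, ts):
    if ts < item.write_ts:           # a younger transaction already wrote it
        return False                 # reject: the reader would be rolled back
    item.read_ts = max(item.read_ts, ts)
    return True

def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:
        return False                 # reject: the writer would be rolled back
    item.write_ts = ts
    return True

x = Item()
print(write(x, ts=5))   # True
print(read(x, ts=3))    # False: transaction 3 is older than the last write
print(read(x, ts=7))    # True
```

The rejected read shows the ordering rule at work: transaction 3 started before the value written at timestamp 5 existed, so letting it read would violate the serial order fixed by transaction age.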
OR
(b) Explain database security mechanisms. (8)
Ans: Database security covers and enforces security on all aspects and components of databases. This includes:
Data stored in the database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash under a distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment against theft and natural disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and implementing a road map/plan to mitigate them
(d) Explain knowledge based database system in detail.
Ans:
The term knowledge-base was coined to distinguish this form of knowledge store from the more common and widely used term database. At the time (the 1970s), virtually all large Management Information Systems stored their data in some type of hierarchical or relational database. At this point in the history of Information Technology, the distinction between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
Flat data: Data was usually represented in a tabular format with strings or numbers in each field.
Multiple users: A conventional database needed to support more than one user or system logged into the same data at the same time.
Transactions: An essential requirement for a database was to maintain integrity and consistency among data accessed by concurrent users. These are the so-called ACID properties: Atomicity, Consistency, Isolation and Durability.
Large, long-lived data: A corporate database needed to support not just thousands but hundreds of thousands or more rows of data. Such a database usually needed to persist past the specific uses of any individual program; it needed to store data for years and decades rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database requirements. An expert system requires structured data: not just tables with numbers and strings, but pointers to other objects that in turn have additional pointers. The ideal representation for a knowledge base is an object model (often called an ontology in the artificial intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with requiring transactional properties on data. The data for the early expert systems was used to arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response to an emergency.[1] Once the solution to the problem was known, there was not a critical demand to store large amounts of data back to a permanent memory store. A more precise statement would be that, given the technologies available, researchers compromised and did without these capabilities because they realized they were beyond what could be expected, and they could develop useful solutions to non-trivial problems without them. Even from the beginning, the more astute researchers realized the potential benefits of being able to store, analyze and reuse knowledge. For example, see the discussion of Corporate Memory in the earliest work of the Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge base compared to a conventional database. The knowledge base needed to know facts about the world, for example to represent the statement that "All humans are mortal". A database typically could not represent this general knowledge, but instead would need to store information about thousands of specific humans. Representing that all humans are mortal, and being able to reason about any given human that they are mortal, is the work of a knowledge base. Representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments, the requirements for their data storage rapidly started to overlap with the standard database requirements for multiple, distributed users with support for transactions. Initially, the demand could be seen in two different but competitive markets. From the AI and Object-Oriented communities, object-oriented databases such as Versant emerged. These were systems designed from the ground up to have support for object-oriented capabilities, but also to support standard database services as well. On the other hand, the large database vendors such as Oracle added capabilities to their products that provided support for knowledge-base requirements, such as class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet. With the rise of the Internet, documents, hypertext and multimedia support were now critical for any corporate database. It was no longer enough to support large tables of data or relatively small objects that lived primarily in computer memory. Support for corporate web sites required persistence and transactions for documents. This created a whole new discipline known as Web Content Management. The other driver for document support was the rise of knowledge management vendors, such as Lotus Notes. Knowledge Management actually predated the Internet, but with the Internet there was great synergy between the two areas. Knowledge management products adopted the term knowledge-base to describe their repositories, but the meaning had a subtle difference. In the case of previous knowledge-based systems, the knowledge was primarily for the use of an automated system, to reason about and draw conclusions about the world. With knowledge management products, the knowledge was primarily meant for humans, for example to serve as a repository of manuals, procedures, policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a system that could really be cleanly classified as knowledge-based in the sense of an expert system that performed automated reasoning, or knowledge-based in the sense of knowledge management that provided knowledge in the form of documents and media that could be leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 – 2019
Subject: DBMS
MCA 1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another. Concurrent access is quite easy if all users are just reading data; there is no way they can interfere with one another. However, any practical database has a mix of read and write operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur with a multi-user system. It helps you make sure that database transactions are performed concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is an essential element for the proper functioning of a system where two or more database transactions that require access to the same data are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from the Ancient Greek átomos, "undividable") is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client: at one moment in time it has not yet happened, and at the next it has already occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B. Performing these operations in an atomic transaction ensures that the database remains in a consistent state, that is, money is neither lost nor created if either of those two operations fails.
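The A-to-B transfer example above can be demonstrated with SQLite, whose transactions commit or roll back as a unit. The account table and balances here are hypothetical, invented for illustration; the simulated crash between the two updates shows the partial withdrawal being rolled back.

```python
# Sketch of the transfer example above using SQLite (hypothetical schema):
# both updates commit together, or a failure rolls both back.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
con.commit()

try:
    with con:  # one atomic transaction: commits on success, rolls back on error
        con.execute("UPDATE account SET balance = balance - 40 WHERE name = 'A'")
        raise RuntimeError("crash before crediting B")  # simulate a failure
        con.execute("UPDATE account SET balance = balance + 40 WHERE name = 'B'")
except RuntimeError:
    pass

# The withdrawal was rolled back: money is neither lost nor created.
print(dict(con.execute("SELECT name, balance FROM account")))  # {'A': 100, 'B': 0}
```

Because the failure happened inside the transaction, the debit of account A was undone along with everything else, which is exactly the all-or-nothing guarantee described above.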
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one that is closest to the user. It is also called the user view. The user view is different from the way data is stored in the database; this view describes only a part of the actual database. Because each user is not concerned with the entire database, only the part that is relevant to the user is visible. For example, end users and application programmers get different external views.
Each user uses a language to carry out database operations. The application programmer uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data manipulation language performs operations on these objects. The data control language is used to control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The conceptual level represents the entire database as a whole and is used by the DBA. This level is the view of the data "as it really is". The user's view of the data is constrained by the language that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for DBMS are suitably explained above.
(C) Describe the structure of DBMS.
Ans: The DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations (insert, delete, update and retrieval) on the database. The components of the DBMS perform these requested operations on the database and provide the necessary data to the users.
[Fig.: Structure of Database Management System]
Components of DBMS:
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified in the DDL. It includes metadata information such as the names of the files and data items, storage details of each file, mapping information, constraints, etc.
2. DML Compiler and Query Optimizer - The DML commands, such as insert, update, delete and retrieve, from the application program are sent to the DML compiler for compilation into object code for database access. The object code is then optimized in the best way to execute the query by the query optimizer and then sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also known as the Database Control System.
The main functions of the Data Manager are:
Convert operations in users' queries, coming from the application programs or from the combination of DML compiler and query optimizer (known as the Query Processor), from the users' logical view to the physical file system.
Control DBMS information access that is stored on disk.
Handle buffers in main memory.
Enforce constraints to maintain the consistency and integrity of the data.
Synchronize the simultaneous operations performed by concurrent users.
Control the backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures, access paths, and file and record sizes.
5. Access authorization - a description of database users, their responsibilities and their access rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation and accuracy; it may be used as an important part of the DBMS.
Importance of Data Dictionary - The Data Dictionary is necessary in databases for the following reasons:
It improves the control of the DBA over the information system and the users' understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts the high-level queries into low-level file access commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups, depending on their degree of expertise or the mode of their interactions with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve users: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of an automatic teller machine, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It implies separation of physical storage from the use of the data by an application program, i.e. program/data independence: the user, programmer or application specialist need not know the details of how the data are stored; such details are transparent to the user. Changes can be made to data without affecting other components of the system, e.g. changing the format of data items (real to integer arithmetic operations), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g. from optical to magnetic storage, from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to duplication of the same data in different files
• Time wasted entering the same data again and again
• Computer resources being needlessly used
• Difficulty in combining information
2. Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data. So we need to remove this duplication of data in multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness are likely to improve, since the data can now be shared, and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, the use of a DBMS should allow users who don't know programming to interact with the data more easily, unlike a file processing system, where the programmer may need to write new programs to meet every new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an ad-hoc, temporary manner. Often different systems of an organization access different components of the operational data; in such an environment, enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8. The organization's requirements can be identified - All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, as the most important. Once a database has been set up with centralized control, it will be necessary to identify the organization's requirements and to balance the needs of the competing units. It may become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for similar services using conventional systems, since the productivity of programmers can be higher when using the non-procedural languages developed with DBMSs than when using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed as per the needs of particular applications; the overall view is often not considered. Building an overall view of an organization's data is usually cost effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes such as recovery and backup from failures, including disk crashes, power failures and software errors, which may help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods are very complex.
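The backup-and-recovery idea can be sketched in a few lines. The example below is an illustrative sketch only, using Python's sqlite3 module as a stand-in for a full DBMS; the account table and its rows are invented for the demonstration.

```python
import sqlite3

# Create a "live" database with some committed data.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
live.execute("INSERT INTO account VALUES (1, 100), (2, 200)")
live.commit()

# Take a full backup (sqlite3's backup API copies the whole database).
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Simulate a failure that loses the live data...
live.execute("DELETE FROM account")
live.commit()

# ...and recover by restoring from the backup copy.
backup.backup(live)
restored = live.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(restored)  # [(1, 100), (2, 200)]
```

A real DBMS layers transaction-log replay on top of such full copies so that committed work after the last backup can also be recovered.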
QUE 2 - EITHER
(A) Explain the ER model with a suitable example.
Ans: The ER model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. It is an iterative, team-oriented process with all business managers (or designates) involved, and it should be validated with a "bottom-up" approach. It has three primary components: entities, relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type. There are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student's (entity) attributes: student ID, student name, address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many (1:M)
Many to one (M:1)
Many to many (M:M)
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example:
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
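To show how such an ER specification maps onto a concrete schema, here is a hedged sketch using Python's sqlite3 module: the composite attributes name and address are flattened into simple columns, and the sample row is invented purely for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Composite attributes name(first, middle, last) and
# address(city, state, zip_code, street(...)) become simple columns;
# customer_id stays the primary key.
con.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,
        first_name       TEXT,
        middle_name      TEXT,
        last_name        TEXT,
        phone_number     TEXT,
        date_of_birth    TEXT,
        city             TEXT,
        state            TEXT,
        zip_code         TEXT,
        street_name      TEXT,
        street_number    TEXT,
        apartment_number TEXT
    )
""")
# A multivalued attribute such as phone_number would instead go in its
# own table (customer_id, phone_number), one row per value.
con.execute(
    "INSERT INTO customer (customer_id, first_name, last_name, city) "
    "VALUES (1, 'Asha', 'Rao', 'Nagpur')")
row = con.execute(
    "SELECT first_name, last_name, city FROM customer WHERE customer_id = 1"
).fetchone()
print(row)  # ('Asha', 'Rao', 'Nagpur')
```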
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index sequential files and direct files, we have considered retrieval and update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
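The idea can be illustrated with a small sketch (Python's sqlite3 as a stand-in for the student file; the names and roll numbers are invented): a secondary index on stud_name lets us fetch the whole set of matching records.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, stud_name TEXT)")
# A secondary index on the non-primary attribute stud_name supports
# secondary key retrieval without scanning the whole file.
con.execute("CREATE INDEX idx_stud_name ON student (stud_name)")
con.executemany("INSERT INTO student VALUES (?, ?)",
                [(1, "Amit"), (2, "Priya"), (3, "Amit")])

# Unlike a primary-key lookup, a secondary-key value can match
# multiple records.
matches = con.execute(
    "SELECT roll_no FROM student WHERE stud_name = ? ORDER BY roll_no",
    ("Amit",)).fetchall()
print(matches)  # [(1,), (3,)] -- the set of records satisfying the value
```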
(D) Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3 - EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries -
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE 4 - EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot be losslessly decomposed into any number of smaller tables except as implied by its candidate keys. Another way of expressing this is that every join dependency is a consequence of the candidate keys. It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
you always need to know two values (pairwise), and
for any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer | vendor        | item
Sally | Liz Claiborne | Blouses
Mary  | Liz Claiborne | Blouses
Sally | Jordach       | Jeans
Mary  | Jordach       | Jeans
Sally | Jordach       | Sneakers
The question is: what do you do if Claiborne starts to sell Jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
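The decomposition into Buyer-Vendor, Buyer-Item and Vendor-Item can be checked mechanically. The sketch below uses plain Python sets on the sample data above: projecting the table onto the three pairwise tables and joining them back reproduces the original, which is exactly the lossless join the join dependency requires.

```python
# The Buying(buyer, vendor, item) sample data from the example.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# Project onto the three pairwise tables of the 5NF decomposition.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# The join dependency says the three-way join of the projections
# must reproduce the original table (a lossless decomposition).
rejoined = {(b, v, i)
            for (b, v) in buyer_vendor
            for (b2, i) in buyer_item if b2 == b
            for (v2, i2) in vendor_item if v2 == v and i2 == i}
print(rejoined == buying)  # True
```

Recording "Claiborne sells Jeans" then takes a single new row in Vendor-Item instead of one row per buyer in the original table.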
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support both batch and online application programs.
[Figure: IMS architecture - Applications A and B, each written in a host language plus DL/I, invoke IMS through their own PSB (PSB-A, PSB-B) containing PCBs; the IMS control program maps the PCBs onto the DBDs of the physical databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD). The mapping of the physical database to storage is also defined in the DBD. The set of DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
DBD   NAME=EDUCPDBD
SEGM  NAME=COURSE,BYTES=256
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
FIELD NAME=DESCRIPN,BYTES=220,START=37
SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
FIELD NAME=LOCATION,BYTES=12,START=7
FIELD NAME=FORMAT,BYTES=2,START=19
SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called a program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example:
PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
SENSEG NAME=COURSE,PROCOPT=G
SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are supported via user-written online application programs. IMS does not provide an integrated query language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
The main characteristics of functional dependencies used in normalization: they have a 1:1 relationship between the attribute(s) on the left- and right-hand side of the dependency, they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation and has the property that every functional dependency in Y is implied by the functional dependencies in X.
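A functional dependency can be checked mechanically against a sample relation. The sketch below is illustrative only (the students data is invented): X → Y holds when equal X-values always come with equal Y-values.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in a
    relation given as a list of dicts: equal lhs values must always
    determine equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # same determinant, different dependent value
    return True

students = [
    {"student_id": 1, "name": "Asha", "city": "Pune"},
    {"student_id": 2, "name": "Ravi", "city": "Pune"},
    {"student_id": 3, "name": "Asha", "city": "Nagpur"},
]
print(fd_holds(students, ["student_id"], ["name"]))  # True
print(fd_holds(students, ["name"], ["city"]))        # False: Asha maps to two cities
```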
(D) Explain 4NF with examples.
Ans: Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal form condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties. Normalization in industry pays particular attention to normal forms up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
BCNF: R is in BCNF iff every determinant is a candidate key
Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF removes unwanted data structures: multivalued dependencies.
Either of these conditions must hold true for a relation to be in fourth normal form:
there is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers multivalued dependencies.
Q5
Either
(A) What are object oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found by a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as they are in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could get a user's account information and efficiently provide extensive information such as transactions, account information entries, etc.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time, even to the best configured systems. This is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery is more easily achieved if you are running at least SQL Server 2000. It has a built-in feature, known as the database recovery model, that controls the following:
both the speed and size of your transaction log backups, and
the degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery models available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after a media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
The Log Marks feature allows you to place reference points in the transaction log so that you can recover to a log mark.
CREATE INDEX operations are logged, so recovery from a transaction log backup that includes index creations is faster, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans:
Many-to-Many (MN)
This is where many occurrences in an entity relate to many occurrences in another entity
The normalisation process discussed earlier would prevent any such relationships but the
definition is included here for completeness
As with one-to-one relationships, many-to-many relationships rarely exist. Normally they occur because an entity has been missed.
For example an employee may work on several projects at the same time and a project has a
team of many employees
Therefore there is a many-to-many relationship between employee and project
Q4
EITHER
(a) Explain DBTG data manipulation.
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on Data Systems Languages (CODASYL), the group responsible for standardization of the programming language COBOL. The DBTG final report appeared in April 1971; it introduced a new, distinct and self-contained language. The DBTG is intended to meet the requirements of many distinct programming languages, not just COBOL; the user in a DBTG system is considered to be an ordinary application programmer, and the language therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the Data Definition Language, or DDL), the DBTG proposed a Subschema Data Definition Language (Subschema DDL) for defining views of the conceptual scheme that was itself defined using the Data Definition Language. It also proposed a Data Manipulation Language (DML) suitable for writing application programs that manipulate the conceptual scheme or a view.
Architecture of the DBTG Model
The architecture of a DBTG system is illustrated in the figure. The architecture of the DBTG model can be divided into three different levels, like the architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema, written in a Data Storage Description Language (DSDL).
Schema
In DBTG, the Conceptual View is defined by the schema. The schema consists essentially of definitions of the various types of record in the database, the data-items they contain, and the sets into which they are grouped. (Here logical record types are referred to as record types; the fields in a logical record format are called data-items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists essentially of a specification of which schema record types the user is interested in, which schema data-items he or she wishes to see in those records, and which schema relationships (sets) linking those records he or she wishes to consider. By default, all other types of record, data-item and set are excluded.
In the DBTG model, the users are application programmers writing in an ordinary programming language, such as COBOL, that has been extended to include the DBTG data manipulation language. Each application program invokes the corresponding subschema using the COBOL Data Base Facility; for example, the programmer simply specifies the name of the required subschema in the Data Division of the program. This invocation provides the definition of the user work area (UWA) for that program. The UWA contains a distinct location for each type of record (and hence for each data-item) defined in the subschema. The program may refer to these data-item and record locations by the names defined in the subschema.
Q5
EITHER
(a) Define normalization. Explain first and second normal form.
Ans: Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties. Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table: transform data from the information source (e.g. a form) into table format with columns and rows.
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value. If a table of data meets the definition of a relation, it is in first normal form:
Every relation has a unique name.
Every attribute value is atomic (single-valued).
Every row is unique.
Attributes in tables have unique names.
The order of the columns is irrelevant.
The order of the rows is irrelevant.
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove the repeating group by:
entering appropriate data into the empty columns of rows containing repeating data ('flattening' the table), or by
placing the repeating data, along with a copy of the original key attribute(s), into a separate relation.
Second Normal Form (2NF)
Based on the concept of full functional dependency: where A and B are attributes of a relation, B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A.
2NF - A relation that is in 1NF and in which every non-primary-key attribute is fully functionally dependent on the primary key.
Equivalently: 1NF and no partial functional dependencies. A partial functional dependency occurs when one or more non-key attributes are functionally dependent on part of the primary key. Every non-key attribute must be defined by the entire key, not just by part of the key. If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.
Third Normal Form (3NF)
2NF and no transitive dependencies. A transitive dependency is a functional dependency between two or more non-key attributes.
Based on the concept of transitive dependency: A, B and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).
3NF - A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.
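The 1NF-to-2NF step above can be sketched in code. Assume an invented 1NF relation OrderLine(order_id, product_id, product_name, qty) with key (order_id, product_id); product_name depends only on product_id, a partial dependency, so it is moved to a new relation together with a copy of its determinant.

```python
# A 1NF relation with key (order_id, product_id); product_name depends
# only on product_id -- a partial dependency that 2NF removes.
order_lines = [
    (1, "P1", "Pen", 10),
    (1, "P2", "Pad", 5),
    (2, "P1", "Pen", 3),
]

# Place the partial dependency in a new relation along with a copy of
# its determinant, as the 1NF-to-2NF step describes.
products = {(pid, pname) for _, pid, pname, _ in order_lines}
order_line_2nf = [(oid, pid, qty) for oid, pid, _, qty in order_lines]

# The decomposition is lossless: joining the two relations back on
# product_id reproduces the original rows.
rejoined = [(oid, pid, pname, qty)
            for oid, pid, qty in order_line_2nf
            for pid2, pname in products if pid2 == pid]
print(sorted(rejoined) == sorted(order_lines))  # True
```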
OR
(c) Explain multivalued dependency with a suitable example.
Ans: As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
BCNF: R is in BCNF iff every determinant is a candidate key
Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF removes unwanted data structures: multivalued dependencies.
Either of these conditions must hold true for a relation to be in fourth normal form:
there is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers multivalued dependencies.
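A multivalued dependency can be tested directly from its definition. The sketch below is illustrative (the course/teacher/book rows are invented): X →→ Y holds if, for any two tuples that agree on X, swapping their Y-values yields tuples that are also in the relation, i.e. the Y-values and the remaining values vary independently.

```python
def mvd_holds(rows, x, y):
    """Check whether the multivalued dependency X ->> Y holds in a
    relation given as a list of dicts (Z = all remaining attributes):
    for every two tuples agreeing on X, the tuple taking X and Z from
    one and Y from the other must also be present."""
    present = {tuple(sorted(r.items())) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                swapped = dict(t2)   # X and Z values from t2...
                for a in y:
                    swapped[a] = t1[a]  # ...and Y values from t1
                if tuple(sorted(swapped.items())) not in present:
                    return False
    return True

# Classic example: a course's teachers and its textbooks vary
# independently, so course ->> teacher holds.
ctb = [
    {"course": "DB", "teacher": "Rao",  "book": "Date"},
    {"course": "DB", "teacher": "Rao",  "book": "Ullman"},
    {"course": "DB", "teacher": "Iyer", "book": "Date"},
    {"course": "DB", "teacher": "Iyer", "book": "Ullman"},
]
print(mvd_holds(ctb, ["course"], ["teacher"]))       # True
# Drop one teacher/book combination and the MVD no longer holds.
print(mvd_holds(ctb[:-1], ["course"], ["teacher"]))  # False
```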
(d) What are inference axioms? Explain their significance in relational database design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule stating that if a relation satisfies certain FDs, then it must satisfy certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of inference axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City - Given
2. Street Zip → Street City - Augmentation of (1) by Street
3. City Street → Zip - Given
4. City Street → City Street Zip - Augmentation of (3) by City Street
5. Street Zip → City Street Zip - Transitivity of (2) and (4)
[From Maier]
1. Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E - Given
2. AB → AB - Reflexivity
3. AB → B - Projectivity from (2)
4. AB → BE - Additivity from (1) and (3)
5. BE → I - Given
6. AB → I - Transitivity from (4) and (5)
7. E → G - Given
8. AB → G - Transitivity from (1) and (7)
9. AB → GI - Additivity from (6) and (8)
10. GI → H - Given
11. AB → H - Transitivity from (9) and (10)
12. AB → GH - Additivity from (8) and (11)
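Such derivations can be checked mechanically by computing an attribute closure (a standard consequence of the axioms above, not something the text itself defines): AB → GH is derivable from F iff G and H are in the closure of AB under F. A small sketch:

```python
def closure(attrs, fds):
    """Compute the closure of a set of attributes under a set of FDs
    given as (lhs, rhs) pairs of attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already in the closure, pull in the
            # right side (transitivity + additivity in one step).
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Maier's example: F = {AB->E, AG->J, BE->I, E->G, GI->H}
fds = [({"A", "B"}, {"E"}), ({"A", "G"}, {"J"}), ({"B", "E"}, {"I"}),
       ({"E"}, {"G"}), ({"G", "I"}, {"H"})]
ab_closure = closure({"A", "B"}, fds)
print({"G", "H"} <= ab_closure)  # True: AB -> GH is derivable from F
```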
Significance in relational database design: A relational database is a database structure, commonly used in GIS, in which data is stored in two-dimensional tables and multiple relationships between data elements can be defined and established in an ad-hoc manner. A Relational Database Management System is a database system made up of files with data elements in a two-dimensional array (rows and columns); it has the capability to recombine data elements to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables:
• Tables are manipulated a set at a time, rather than a record at a time.
• SQL is used to manipulate relational databases. The model was proposed by Dr. Codd in 1970.
• It is the basis for the relational database management system (RDBMS).
• The relational model contains the following components:
• a collection of objects or relations, and
• a set of operations to act on the relations.
Q5
EITHER
(a) What is deadlock How can it be avoided How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions each require access to data that is locked by the other. It can be handled in two ways: (1) set measures which prevent deadlocks from happening, and (2) set ways in which to break a deadlock after it happens. One way to prevent deadlocks is to require the user to request all necessary locks at one time, ensuring the transaction gains access to everything it needs or to nothing. Deadlocks can also sometimes be avoided by setting a resource access order, meaning resources must be locked in a fixed order to prevent such instances. Once a deadlock does occur, the DBMS must have a method for detecting it; to resolve it, the DBMS selects a victim transaction to cancel and reverts the entire transaction until the resources it held become available, allowing one transaction to complete while the other is reprocessed at a later time.
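The resource-ordering idea above can be sketched in a few lines of Python. This is a toy illustration with threads and locks, not actual DBMS internals; the helper names are invented for the sketch. Both transactions ask for the two locks in opposite orders, but because locks are always acquired in one fixed global order, the classic hold-and-wait cycle cannot form.

```python
import threading

# Two resources that transactions may both need.
lock_a = threading.Lock()
lock_b = threading.Lock()

def locks_in_order(*locks):
    # Always acquire locks in a fixed global order (here: by object id),
    # so two transactions can never hold them in opposite orders.
    return sorted(locks, key=id)

def transaction(name, need, results):
    ordered = locks_in_order(*need)
    for lk in ordered:
        lk.acquire()
    try:
        results.append(name)  # critical section: use both resources
    finally:
        for lk in reversed(ordered):
            lk.release()

results = []
# T1 asks for (A, B); T2 asks for (B, A): without ordering this can deadlock.
t1 = threading.Thread(target=transaction, args=("T1", (lock_a, lock_b), results))
t2 = threading.Thread(target=transaction, args=("T2", (lock_b, lock_a), results))
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(results))  # → ['T1', 'T2'] — both finish, no deadlock
```

The same idea appears in real systems as a fixed lock-acquisition order on tables or rows.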
Q: Explain the meaning of the expression "ACID transaction".
Ans: ACID stands for Atomicity, Consistency, Isolation, Durability. When any transaction happens it should be atomic: it should either complete fully or not at all; there should be nothing like a semi-complete transaction. The database state should remain consistent after the completion of the transaction. If there is more than one transaction, the transactions should be scheduled in such a fashion that they remain in isolation from one another. Durability means that once a transaction commits, its effects will persist even if there are system failures.
Q: What is the purpose of transaction isolation levels?
Ans: Transaction isolation
levels affect how the database operates while transactions are in the process of being changed. Their purpose is to ensure consistency throughout the database. For example, if I am changing a row that affects the calculations or outputs of several other rows, then all rows that are affected, or possibly affected, by the change will be locked from changes until my change is complete. This isolates the change and ensures that the data interaction remains accurate and consistent; this is known as transaction-level consistency. A transaction being changed, which may affect several other pieces of data or rows of input, can also affect how those rows are read. Say I am processing a change to the tax rate in my state; my store clerk should not be able to read the total cost of a blue shirt, because the total-cost row is affected by any change in the tax-rate row. Essentially, how you deal with the reading and viewing of data while a change is being processed but has not yet been committed is known as the transaction isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction being committed.
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment, where multiple transactions can be executed simultaneously, it is highly important to control the concurrency of transactions. We have concurrency control protocols to ensure atomicity, isolation and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
Lock based protocols
Time stamp based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it Locks are of two
kinds:
Binary Locks: a lock on a data item can be in one of two states; it is either locked or unlocked.
Shared/exclusive: this type of locking mechanism differentiates the locks based on
their uses If a lock is acquired on a data item to perform a write operation it is an
exclusive lock Allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state Read locks are shared because no data
value is being changed
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the 'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
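The growing/shrinking rule above can be made concrete with a toy sketch. The class below is a hypothetical bookkeeping object, not a real lock manager: once the transaction releases its first lock, the shrinking phase begins and any further lock request is rejected.

```python
class TwoPhaseTransaction:
    """Toy bookkeeping for 2PL: once any lock is released (shrinking
    phase begins), requesting a new lock is an error."""

    def __init__(self):
        self.held = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: cannot lock after first unlock")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True   # the first release starts the shrinking phase
        self.held.discard(item)

t = TwoPhaseTransaction()
t.lock("A"); t.lock("B")   # growing phase
t.unlock("A")              # shrinking phase begins here
try:
    t.lock("C")            # illegal under 2PL
except RuntimeError as e:
    print(e)  # → 2PL violation: cannot lock after first unlock
```

Strict-2PL, described next, simply delays every unlock() until commit, which is what prevents cascading aborts.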
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock after using it: it holds all the locks until the commit point and releases them all at once.
Strict-2PL does not have cascading abort as 2PL does
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based protocol This protocol
uses either system time or logical counter as a timestamp
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution whereas timestamp-based protocols start working as soon as a transaction is
created
Every transaction has a timestamp associated with it, and the ordering is determined by the age of the transaction. A transaction created at clock time 0002 would be older than all transactions that come after it; for example, a transaction entering the system at 0004 is two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read-timestamp and write-timestamp. This lets the system know when the last 'read' and 'write' operations were performed on the data item.
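The read- and write-timestamps just described drive the basic timestamp-ordering checks. The sketch below is a simplified illustration (the Item class and the convention of rejecting by returning False are assumptions of the sketch; a real DBMS would roll the transaction back and restart it with a new timestamp).

```python
# Hypothetical data item carrying the latest read/write timestamps,
# as described above; timestamps are just increasing integers.
class Item:
    def __init__(self):
        self.r_ts = 0   # timestamp of the youngest reader so far
        self.w_ts = 0   # timestamp of the youngest writer so far

def read(item, ts):
    # An older transaction may not read a value written by a younger one.
    if ts < item.w_ts:
        return False            # reject: roll back the reader
    item.r_ts = max(item.r_ts, ts)
    return True

def write(item, ts):
    # An older transaction may not overwrite data already read or
    # written by a younger transaction.
    if ts < item.r_ts or ts < item.w_ts:
        return False            # reject: roll back the writer
    item.w_ts = ts
    return True

x = Item()
print(write(x, 5))   # → True  (youngest writer so far is now 5)
print(read(x, 3))    # → False (transaction 3 is older than writer 5)
print(read(x, 8))    # → True
print(write(x, 6))   # → False (6 is older than reader 8)
```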
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash under a distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and implementing a road map/plan to mitigate them
(d) Explain knowledge-based database systems in detail.
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data: data was usually represented in a tabular format, with strings or numbers in each field.
Multiple users: a conventional database needed to support more than one user or system logged into the same data at the same time.
Transactions: an essential requirement for a database was to maintain integrity and consistency among data accessed by concurrent users. These are the so-called ACID properties: Atomicity, Consistency, Isolation and Durability.
Large, long-lived data: a corporate database needed to support not just thousands but hundreds of thousands or more rows of data. Such a database usually needed to persist past the specific uses of any individual program; it needed to store data for years and decades rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database requirements. An expert system requires structured data: not just tables with numbers and strings, but pointers to other objects that in turn have additional pointers. The ideal representation for a knowledge base is an object model (often called an ontology in the artificial intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response to an emergency.[1] Once the solution to the problem was known, there was not a critical demand to store large amounts of data back to a permanent memory store. A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge-base compared to a conventional database. The knowledge-base needed to know facts about the world, for example to represent the statement that "All humans are mortal". A database typically could not represent this general knowledge, but instead would need to store thousands of rows of information about specific humans. Representing that all humans are mortal, and being able to reason about any given human that they are mortal, is the work of a knowledge-base. Representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other customers are all humans with specific ages, sex, address, etc. is the work of a database.[3][4]
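The contrast can be made concrete with a small sketch (illustrative Python, not any particular knowledge-base system): the database side enumerates rows about individuals, while the knowledge-base side states the rule once and infers the rest.

```python
# The database way: enumerate facts about specific individuals (rows).
customers = [
    {"name": "George", "species": "human"},
    {"name": "Mary",   "species": "human"},
]

# The knowledge-base way: one general rule plus inference.
def is_mortal(individual):
    # Rule: all humans are mortal.
    return individual["species"] == "human"

# Inference applies the general rule to any individual the database knows.
print([c["name"] for c in customers if is_mortal(c)])  # → ['George', 'Mary']
```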
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, the large database vendors, such as Oracle, added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge management vendors such as Lotus Notes. Knowledge Management actually predated the Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a system that could really be cleanly classified either as knowledge-based in the sense of an expert system that performed automated reasoning, or as knowledge-based in the sense of knowledge management that provided knowledge in the form of documents and media that could be leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another. Concurrent access is quite easy if all users are just reading data; there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts which mostly occur with a multi-
user system It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of respective databases
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek átomos, 'undividable') is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client: at one moment in time it has not yet happened, and at the next it has already occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B. Performing these operations in an atomic transaction ensures that the database remains in a consistent state, that is, money is neither lost nor created if either of the two operations fails.
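The transfer example above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration: the account table, balances and the insufficient-funds check are assumptions of the sketch. The key point is that `with conn:` wraps both UPDATEs in one transaction that commits on success and rolls back on error, so money is never half-moved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    try:
        with conn:  # one atomic transaction: commit on success, rollback on error
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            cur = conn.execute("SELECT balance FROM account WHERE name = ?", (src,))
            if cur.fetchone()[0] < 0:
                raise ValueError("insufficient funds")  # forces the rollback
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except ValueError:
        return False

print(transfer(conn, "A", "B", 30))    # → True
print(transfer(conn, "A", "B", 500))   # → False: rolled back, nothing half-done
print(dict(conn.execute("SELECT name, balance FROM account")))  # → {'A': 70, 'B': 80}
```

After the failed transfer the balances are exactly what they were before it started, which is the atomicity guarantee in miniature.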
(B) Give the three-level architecture proposal for DBMS.
Ans Objective of three level architecture proposal for DBMS
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to the physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
The above three levels are explained in detail below:
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user view differs from the way data is stored in the database; this view describes only the part of the actual database that is relevant to the user, since each user is not concerned with the entire database. For example, end users and application programmers get different external views.
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language such as COBOL or C or a fourth-generation
language specific to the DBMS such as visual FoxPro or MS Access
The end user uses a query language to access data from the database. A query language is a combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data manipulation language performs operations on these objects. The data control language is used to control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The conceptual level represents the entire database as a whole and is used by the DBA. This level is the view of the data 'as it really is'. The user's view of the data is constrained by the language they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus the objectives of the three-level architecture proposal for a DBMS are explained above.
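One way to see the external level in practice is through SQL views, sketched here with Python's sqlite3 module. The employee table and the view are invented for the illustration: the base table is the conceptual level, and the view exposes only the columns a given user group may see, independent of how the table is physically stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: the full employee relation.
conn.execute("""CREATE TABLE employee
                (emp_id INTEGER PRIMARY KEY, name TEXT,
                 dept TEXT, salary INTEGER)""")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)",
                 [(1, "Asha", "Sales", 40000), (2, "Ravi", "HR", 35000)])

# External level: a view exposing only what this user group may see
# (no salary column). The user's view is immune to how the base
# table is actually stored or later reorganized.
conn.execute("CREATE VIEW emp_public AS SELECT emp_id, name, dept FROM employee")

print(conn.execute("SELECT * FROM emp_public").fetchall())
# → [(1, 'Asha', 'Sales'), (2, 'Ravi', 'HR')]
```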
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig.: Structure of a Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2 DML Compiler and Query Optimizer - The DML commands (insert, update, delete, retrieve) from the application program are sent to the DML compiler for compilation into object code for database access. The object code is then optimized by the query optimizer to find the best way to execute the query, and sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also known as the Database Control System.
The main functions of the Data Manager are:
Converting operations in users' queries, coming from the application programs or from the combination of DML compiler and query optimizer (known as the query processor), from the user's logical view to the physical file system.
Controls DBMS information access that is stored on disk
It also controls handling buffers in main memory
It also enforces constraints to maintain consistency and integrity of the data
It also synchronizes the simultaneous operations performed by the concurrent users
It also controls the backup and recovery operations
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures, access paths, and file and record sizes.
5. Access authorization - a description of database users, their responsibilities and their access rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation and accuracy; it may be used as an important part of the DBMS.
Importance of Data Dictionary - the data dictionary is necessary in databases for the following reasons:
• It improves the control of the DBA over the information system and the users' understanding of the use of the system.
• It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts high-level queries into low-level file access commands known as compiled DML.
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve Users: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online Users: These users may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. They are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It implies separation of physical storage from the use of the data by an application program, i.e. program/data independence: the user, programmer or application specialist need not know the details of how the data are stored, since such details are transparent to the user. Changes can be made to the data without affecting other components of the system, e.g. changing the format of data items (real to integer), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updating of the same data in different files
• Time wasted entering the same data again and again
• Computer resources being needlessly used
• Difficulty in combining information
2 Elimination of Inconsistency - In a file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This can lead to inconsistent data, so we need to remove this duplication of data across multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness is likely to improve, since the data can now be shared and the DBMS makes it easy to respond to anticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, use of a DBMS should allow users who don't know programming to interact with the data more easily, unlike a file processing system where the programmer may need to write new programs to meet every new demand.
4 Flexibility of the system is improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5 Integrity can be improved - Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints. In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purposes of data interchange or migration between systems.
7 Security can be improved - In conventional systems, applications are developed in an ad-hoc/temporary manner. Often different systems of an organization would access different components of the operational data; in such an environment enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions since the data is now centralized. It is easier to control who has access to what parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8 Organization's requirements can be identified - All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, as the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. It may become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for a similar service using conventional systems, since the productivity of programmers can be higher using the non-procedural languages developed with DBMSs than using procedural languages.
10 A data model must be developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems it is more likely that files will be designed as the needs of particular applications demand; the overall view is often not considered. Building an overall view of an organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes such as backup and recovery from failures, including disk crashes, power failures and software errors, which help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The E-R model is a 'top-down' approach. This data model allows us to describe how data is used in a real-world enterprise; building it is an iterative, team-oriented process in which all business managers (or their designates) should be involved, and it should be validated with a 'bottom-up' approach. It has three primary components: entities, relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified An entity is an abstraction from the complexities of some
domain When we speak of an entity we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object, such as a house or a car; an event, such as a house sale or a car service; or a concept, such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type. There are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student's (entity) attributes include student ID, student name, address, etc.
Attributes are of various types
Simple/Single attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an 'owns' relationship between a company and a computer; a 'supervises' relationship between an employee and a department; a 'performs' relationship between an artist and a song; a 'proved' relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many (1 : M)
Many to one (M : 1)
Many to many (M : M)
Symbols and their meanings:
Rectangles represent entity sets.
Diamonds represent relationship sets.
Lines link attributes to entity sets and entity sets to relationship sets.
Ellipses represent attributes.
Double ellipses represent multivalued attributes.
Dashed ellipses denote derived attributes.
Underline indicates primary key attributes.
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), where street is itself composite (street_name, street_number, apartment_number).
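The attribute categories above (composite, multivalued, derived) can be illustrated with a small Python sketch; the class and field layout below mirrors the Customer example and is invented purely for illustration:

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class Street:                      # composite component of the address attribute
    street_name: str
    street_number: str
    apartment_number: str = ""

@dataclass
class Address:                     # composite attribute of Customer
    city: str
    state: str
    zip_code: str
    street: Street

@dataclass
class Customer:                    # the entity; customer_id is the primary key
    customer_id: int
    first_name: str                # components of the composite attribute "name"
    last_name: str
    date_of_birth: date
    phone_numbers: List[str] = field(default_factory=list)  # multivalued attribute
    address: Optional[Address] = None

    def age(self, today: date) -> int:
        # derived attribute: computed from date_of_birth, never stored
        return today.year - self.date_of_birth.year
```

A derived attribute such as age is recomputed on demand, which is why it is drawn as a dashed ellipse rather than stored with the entity.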
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index-sequential files and direct files we have considered the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
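A minimal sketch of secondary-key retrieval in Python; the student records and the build_secondary_index helper are invented for this illustration:

```python
from collections import defaultdict

# Student file: records keyed by the primary key stud_id.
students = {
    101: {"stud_id": 101, "stud_name": "Asha", "dept": "MCA"},
    102: {"stud_id": 102, "stud_name": "Ravi", "dept": "MCA"},
    103: {"stud_id": 103, "stud_name": "Asha", "dept": "MBA"},
}

def build_secondary_index(records, attr):
    """Map each secondary-key value to the list of primary keys holding it."""
    index = defaultdict(list)
    for pk, rec in records.items():
        index[rec[attr]].append(pk)
    return index

name_index = build_secondary_index(students, "stud_name")

# Secondary-key retrieval: one value may match many records.
matches = [students[pk] for pk in name_index["Asha"]]
```

Unlike a primary-key lookup, the search on "stud_name" returns a set of records, which is exactly point (ii) above.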
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation

Q3
EITHER
(A) Let R(A, B, C) and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
Q4
EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot be non-loss decomposed any further into smaller tables.
Another way of expressing this is: every join dependency in the table is a consequence of its candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:

buyer | vendor        | item
Sally | Liz Claiborne | Blouses
Mary  | Liz Claiborne | Blouses
Sally | Jordach       | Jeans
Mary  | Jordach       | Jeans
Sally | Jordach       | Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
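The decomposition and its lossless rejoin can be sketched in Python; this is illustrative only, and the join3 helper is a naive triple natural join:

```python
# Original Buying relation as a set of (buyer, vendor, item) tuples.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# Project the table onto the three pairwise tables of the 5NF decomposition.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

def join3(bv, bi, vi):
    """Natural join of the three projections on their shared columns."""
    return {(b, v, i)
            for b, v in bv
            for b2, i in bi if b2 == b
            for v2, i2 in vi if v2 == v and i2 == i}
```

Because the join dependency holds for this data, joining the three projections reconstructs the original table exactly. Recording that Claiborne starts to sell jeans then needs only one new pair in vendor_item; the join derives the buyer combinations.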
(B) Explain the architecture of an IMS System.
Ans: Information Management System (IMS) is an IBM program product that is designed to support both batch and online application programs.
[Figure: IMS architecture. Each application program (A, B) is written in a host language plus DL/I and runs against its own PSB, which is a set of PCBs; the PCBs map onto the physical databases, each defined by a DBD, all coordinated by the IMS control program.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD). The mapping of the physical database to storage is also defined in the DBD. The set of DBDs thus corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
DBD   NAME=EDUCPDBD
SEGM  NAME=COURSE,BYTES=256
FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
FIELD NAME=DESCRIPN,BYTES=220,START=37
SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
FIELD NAME=(DATE#,SEQ,M),BYTES=6,START=1
FIELD NAME=LOCATION,BYTES=12,START=7
FIELD NAME=FORMAT,BYTES=2,START=19
SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called the program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example:
PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
SENSEG NAME=COURSE,PROCOPT=G
SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are supported via user-written online application programs. IMS does not provide an integrated query language.
OR
(C) Explain the following:
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
The main characteristics of the functional dependencies used in normalization are that they have a 1:1 relationship between the attribute(s) on the left- and right-hand side of the dependency, they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
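Whether a functional dependency holds in a given table extension can be checked mechanically: equal determinant values must imply equal dependent values. A small illustrative Python sketch (fd_holds and the sample rows are invented for this example):

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the FD lhs -> rhs holds in `rows` (each row a dict):
    rows that agree on all lhs attributes must agree on all rhs attributes."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False            # same determinant, different dependent value
    return True

rows = [
    {"stud_id": 1, "dept": "MCA", "hod": "Rao"},
    {"stud_id": 2, "dept": "MCA", "hod": "Rao"},
    {"stud_id": 3, "dept": "MBA", "hod": "Sen"},
]
```

Here dept → hod holds (each department has one head), while hod → stud_id does not, since "Rao" is paired with two student IDs.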
(D) Explain 4NF with examples.
Ans: Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal-form condition that it meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF. The database designers need not normalize to the highest possible normal form.
It is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes, often executed as a series of steps; each step corresponds to a specific normal form which has known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF removes unwanted data structures: multivalued dependencies.
Either of these conditions must hold for the relation to be in fourth normal form:
There is no multivalued dependency in the relation; or
There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it uses multivalued dependencies.
Q5
Either
(A) What are object oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been developed since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems:
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could get a user's account information and efficiently provide extensive information such as transactions, account information entries, etc.
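The pointer-following access described above can be sketched in Python; Account and Transaction are hypothetical classes invented for this illustration:

```python
# In an object database, an Account object holds direct references to its
# Transaction objects, so retrieval follows pointers instead of joining
# an accounts table with a transactions table on a foreign key.
class Transaction:
    def __init__(self, amount):
        self.amount = amount

class Account:
    def __init__(self, owner):
        self.owner = owner
        self.transactions = []    # direct object references, no foreign key

acct = Account("Alice")
acct.transactions.append(Transaction(100))
acct.transactions.append(Transaction(-40))

# Navigational access: no search over a separate table is needed.
balance = sum(t.amount for t in acct.transactions)
```

The relational equivalent would scan or index a transactions table for the matching account id; here the references are followed directly.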
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time, even to the best configured systems. This is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000. It has a built-in feature known as the database recovery model that controls the following:
Both the speed and size of your transaction log backups.
The degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery models available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. The SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
This model logs CREATE INDEX operations. Recovery from a transaction log backup that includes index creations is done at a faster pace, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT and UPDATETEXT.
Simple Recovery Model
It allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Q4
EITHER
(a) Explain DBTG Data Manipulation.
Ans: The acronym DBTG refers to the Data Base Task Group of the Conference on Data Systems Languages (CODASYL), the group responsible for standardization of the programming language COBOL. The DBTG final report appeared in April 1971; it introduced a new, distinct and self-contained language. The DBTG proposal is intended to meet the requirements of many distinct programming languages, not just COBOL; the user in a DBTG system is considered to be an ordinary application programmer, and the language therefore is not biased toward any single specific programming language.
It is based on the network model. In addition to proposing a formal notation for networks (the Data Definition Language or DDL), the DBTG has proposed a Subschema Data Definition Language (Subschema DDL) for defining views of the conceptual scheme that was itself defined using the Data Definition Language. It also proposed a Data Manipulation Language (DML) suitable for writing application programs that manipulate the conceptual scheme or a view.
Architecture of DBTG Model
The architecture of a DBTG system is illustrated in Figure. The architecture of the DBTG model can be divided into three different levels, as in the architecture of a database system. These are:
• Storage Schema (corresponds to the Internal View of the database)
• Schema (corresponds to the Conceptual View of the database)
• Subschema (corresponds to the External View of the database)
Storage Schema
The storage structure (Internal View) of the database is described by the storage schema, written in a Data Storage Description Language (DSDL).
Schema
In DBTG the Conceptual View is defined by the schema. The schema consists essentially of definitions of the various types of record in the database, the data-items they contain, and the sets into which they are grouped. (Here logical record types are referred to as record types; the fields in a logical record format are called data-items.)
Subschema
The External View (not a DBTG term) is defined by a subschema. A subschema consists essentially of a specification of which schema record types the user is interested in, which schema data-items he or she wishes to see in those records, and which schema relationships (sets) linking those records he or she wishes to consider. By default, all other types of record, data-item and set are excluded.
In the DBTG model the users are application programmers writing in an ordinary programming language, such as COBOL, that has been extended to include the DBTG data manipulation language. Each application program invokes the corresponding subschema using the COBOL Data Base Facility; for example, the programmer simply specifies the name of the required subschema in the Data Division of the program. This invocation provides the definition of the user work area (UWA) for that program. The UWA contains a distinct location for each type of record (and hence for each data-item) defined in the subschema. The program may refer to these data-item and record locations by the names defined in the subschema.
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table: transform data from the information source (e.g. a form) into table format with columns and rows.
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value. If a table of data meets the definition of a relation, it is in first normal form:
Every relation has a unique name.
Every attribute value is atomic (single-valued).
Every row is unique.
Attributes in tables have unique names.
The order of the columns is irrelevant.
The order of the rows is irrelevant.
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove each repeating group either by entering appropriate data into the empty columns of rows containing repeating data ('flattening' the table), or by placing the repeating data, along with a copy of the original key attribute(s), into a separate relation.
Second Normal Form (2NF)
Based on the concept of full functional dependency: if A and B are attributes of a relation, B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A.
2NF: a relation that is in 1NF and in which every non-primary-key attribute is fully functionally dependent on the primary key; equivalently, 1NF and no partial functional dependencies.
A partial functional dependency arises when one or more non-key attributes are functionally dependent on part of the primary key. Every non-key attribute must be defined by the entire key, not just by part of the key. If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies on the primary key exist, remove them by placing them in a new relation along with a copy of their determinant.
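The 1NF-to-2NF step can be sketched in Python; the enrolment relation and its partial dependency stud_id → stud_name are invented for this illustration:

```python
# 1NF relation with composite key (stud_id, course_id). stud_name depends
# on stud_id alone -- a partial dependency that violates 2NF.
enrolment = [
    {"stud_id": 1, "course_id": "C1", "stud_name": "Asha", "grade": "A"},
    {"stud_id": 1, "course_id": "C2", "stud_name": "Asha", "grade": "B"},
    {"stud_id": 2, "course_id": "C1", "stud_name": "Ravi", "grade": "A"},
]

# Move the partially dependent attribute into a new relation together with
# a copy of its determinant; the attribute dependent on the full key stays.
student = {(r["stud_id"], r["stud_name"]) for r in enrolment}
grades  = {(r["stud_id"], r["course_id"], r["grade"]) for r in enrolment}
```

The duplicated name "Asha" collapses to a single row in the new student relation, which is exactly the redundancy 2NF removes.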
Third Normal Form (3NF)
2NF and no transitive dependencies. A transitive dependency is a functional dependency between two or more non-key attributes: if A, B and C are attributes of a relation such that A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).
3NF: a relation that is in 1NF and 2NF, and in which no non-primary-key attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with suitable example.
Ans: A multivalued dependency X →→ Y holds in a relation R when, for each value of X, the set of associated Y values is determined by X alone, independently of the remaining attributes of R.
As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form.
2. 1NF: R is in 1NF iff all domain values are atomic.
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
5. BCNF: R is in BCNF iff every determinant is a candidate key.
6. Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF removes unwanted data structures: multivalued dependencies.
Either of these conditions must hold for the relation to be in fourth normal form:
There is no multivalued dependency in the relation; or
There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it uses multivalued dependencies.
(d) What are inference axioms? Explain their significance in Relational Database Design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity from (2) and (4)
[From Maier]
1. Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
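Derivations like the one above can be checked mechanically with an attribute-closure routine, which in effect applies transitivity and additivity until a fixed point. A sketch in Python, assuming the FDs are given as (lhs, rhs) pairs of sets:

```python
def closure(attrs, fds):
    """Closure of an attribute set under FDs given as (lhs, rhs) pairs of
    sets: repeatedly add rhs whenever lhs is already contained."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Maier's example: F = {AB -> E, AG -> J, BE -> I, E -> G, GI -> H}
F = [({"A", "B"}, {"E"}), ({"A", "G"}, {"J"}), ({"B", "E"}, {"I"}),
     ({"E"}, {"G"}), ({"G", "I"}, {"H"})]
```

AB → GH holds exactly when GH is contained in the closure of {A, B}, which the routine confirms without tracing the twelve-step proof by hand.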
Significance in Relational Database Design: A relational database is a structure, commonly used in GIS, in which data is stored in two-dimensional tables and multiple relationships between data elements can be defined and established in an ad-hoc manner. A Relational Database Management System is a database system made up of files with data elements in a two-dimensional array (rows and columns); it has the capability to recombine data elements to form different relations, resulting in great flexibility of data usage. The inference axioms are significant because they let the designer derive all functional dependencies implied by a given set, and hence compute attribute closures, find candidate keys and verify normal forms.
A relational database is perceived by the user as a collection of two-dimensional tables:
• the tables are manipulated a set at a time, rather than a record at a time;
• SQL is used to manipulate relational databases.
The relational model was proposed by Dr. Codd in 1970 and is the basis for the relational database management system (RDBMS). The relational model contains the following components:
• a collection of objects or relations;
• a set of operations to act on the relations.
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that is being locked by the other user. It can be dealt with in two ways: one is to set measures which prevent deadlocks from happening, and the other is to set ways in which to break a deadlock after it happens. One way to prevent or avoid deadlocks is to require the user to request all necessary locks at one time, ensuring they gain access to everything they need or to nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order, meaning resources must be locked in a certain order to prevent such instances. Once a deadlock does occur, the DBMS must have a method for detecting it; to resolve it, the DBMS must select a transaction to cancel and revert that entire transaction until the required resources become available, allowing one transaction to complete while the other has to be reprocessed at a later time.
Explain the meaning of the expression ACID transaction.
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it should be atomic: it should either be complete or fully incomplete; there should not be anything like semi-complete. The database state should remain consistent after the completion of the transaction. If there is more than one transaction, then the transactions should be scheduled in such a fashion that they remain in isolation from one another. Durability means that once a transaction commits, its effects will persist even if there are system failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the process of being changed. Their purpose is to ensure consistency throughout the database. For example, if I am changing a row which affects the calculations or outputs of several other rows, then all rows that are affected, or possibly affected, by a change in the row I am working on will be locked from changes until my change is complete. This isolates the change, ensures that the data interaction remains accurate and consistent, and is known as transaction-level consistency. The transaction being changed, which may affect several other pieces of data or rows of input, could also affect how those rows are read. So let us say I am processing a change to the tax rate in my state: my store clerk should not be able to read the total cost of a blue shirt, because the total-cost row is affected by any change in the tax-rate row. Essentially, how you deal with the reading and viewing of data while a change is being processed but has not yet been committed is known as the transaction isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction being committed.
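Deadlock detection is usually described as finding a cycle in a wait-for graph (an edge T1 → T2 means T1 waits for a lock held by T2). A minimal Python sketch; has_deadlock is an illustrative name:

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph {txn: set of txns it waits on}."""
    visited, on_stack = set(), set()

    def visit(t):
        if t in on_stack:
            return True            # back edge: a cycle, hence a deadlock
        if t in visited:
            return False
        visited.add(t)
        on_stack.add(t)
        if any(visit(u) for u in wait_for.get(t, ())):
            return True
        on_stack.discard(t)
        return False

    return any(visit(t) for t in wait_for)
```

When a cycle is found, the DBMS picks a victim transaction on the cycle, aborts it and rolls it back, which is the resolution strategy described above.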
(b) Explain concurrency control and database recovery in detail.
Ans: In a multiprogramming environment, where multiple transactions can be executed simultaneously, it is highly important to control the concurrency of transactions. We have concurrency control protocols to ensure atomicity, isolation and serializability of concurrent transactions. Concurrency control protocols can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two kinds:
Binary locks: a lock on a data item can be in two states; it is either locked or unlocked.
Shared/exclusive locks: this type of locking mechanism differentiates the locks based on their uses. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock; allowing more than one transaction to write on the same data item would lead the database into an inconsistent state. Read locks are shared because no data value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a write operation is performed. Transactions may unlock the data item after completing the 'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they need locks. Before initiating an execution, the transaction requests the system for all the locks it needs beforehand. If all the locks are granted, the transaction executes and releases all the locks when all its operations are over. If all the locks are not granted, the transaction rolls back and waits until all the locks are granted.
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first part, when the transaction starts executing, it seeks permission for the locks it requires. The second part is where the transaction acquires all the locks. As soon as the transaction releases its first lock, the third phase starts; in this phase the transaction cannot demand any new locks, it only releases the acquired locks.
Two-phase locking has two phases: one is growing, where all the locks are being acquired by the transaction, and the second is shrinking, where the locks held by the transaction are being released. To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock after using it; it holds all the locks until the commit point and releases them all at one time. Strict-2PL therefore does not have cascading aborts, as 2PL does.
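The growing/shrinking discipline of 2PL can be sketched in Python; this is a toy, single-threaded illustration that ignores lock modes, conflicts between transactions, and blocking:

```python
class TwoPhaseLockingError(Exception):
    pass

class Transaction2PL:
    """Growing phase acquires locks; after the first release (the shrinking
    phase begins) no further lock may be requested."""
    def __init__(self):
        self.locks = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            raise TwoPhaseLockingError("no new locks in shrinking phase")
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True      # first release starts the shrinking phase
        self.locks.discard(item)

t = Transaction2PL()
t.lock("A")
t.lock("B")          # growing phase: locks may still be acquired
t.unlock("A")        # shrinking phase begins; no new locks from here on
```

Strict-2PL would simply never call unlock before commit, releasing everything at the commit point instead.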
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol uses either system time or a logical counter as a timestamp. Lock-based protocols manage the order between conflicting pairs among transactions at the time of execution, whereas timestamp-based protocols start working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age of the transaction. A transaction created at 00:02 clock time would be older than all other transactions that come after it; for example, any transaction y entering the system at 00:04 is two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read- and write-timestamp. This lets the system know when the last 'read' and 'write' operations were performed on the data item.
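The per-item read/write timestamp checks can be sketched in Python; this is a simplified version that ignores Thomas's write rule and the restart of rolled-back transactions:

```python
class TimestampError(Exception):
    pass

class DataItem:
    """Basic timestamp-ordering checks: reject operations arriving
    'too late' relative to younger transactions' timestamps."""
    def __init__(self):
        self.read_ts = 0     # timestamp of the youngest reader so far
        self.write_ts = 0    # timestamp of the youngest writer so far

    def read(self, ts):
        if ts < self.write_ts:          # item already overwritten by a younger txn
            raise TimestampError("read rejected: roll back transaction")
        self.read_ts = max(self.read_ts, ts)

    def write(self, ts):
        if ts < self.read_ts or ts < self.write_ts:
            raise TimestampError("write rejected: roll back transaction")
        self.write_ts = ts

x = DataItem()
x.read(ts=2)     # transaction with timestamp 2 reads x
x.write(ts=3)    # younger transaction (timestamp 3) writes x
```

An older transaction (say timestamp 1) attempting to write x afterwards would be rejected and rolled back, preserving the timestamp order.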
OR
(b) Explain database security mechanisms.
Ans: Database security covers and enforces security on all aspects and components of databases. This includes:
Data stored in the database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access and data management controls.
Load/stress testing and capacity testing of a database to ensure it does not crash in a distributed denial of service (DDoS) attack or user overload.
Physical security of the database server and backup equipment from theft and natural disasters.
Reviewing the existing system for any known or unknown vulnerabilities, and defining and implementing a road map/plan to mitigate them.
(d) Explain knowledge-based database systems in detail.
Ans:
The term knowledge-base was coined to distinguish this form of knowledge store from the more common and widely used term database. At the time (the 1970s) virtually all large Management Information Systems stored their data in some type of hierarchical or relational database. At this point in the history of Information Technology, the distinction between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
Flat data: Data was usually represented in a tabular format, with strings or numbers in each field.
Multiple users: A conventional database needed to support more than one user or system logged into the same data at the same time.
Transactions: An essential requirement for a database was to maintain integrity and consistency among data accessed by concurrent users. These are the so-called ACID properties: Atomicity, Consistency, Isolation and Durability.
Large, long-lived data: A corporate database needed to support not just thousands but hundreds of thousands or more rows of data. Such a database usually needed to persist past the specific uses of any individual program; it needed to store data for years and decades rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database requirements. An expert system requires structured data: not just tables with numbers and strings, but pointers to other objects that in turn have additional pointers. The ideal representation for a knowledge base is an object model (often called an ontology in artificial intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with requiring transactional properties on data. The data for the early expert systems was used to arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response to an emergency.[1] Once the solution to the problem was known, there was not a critical demand to store large amounts of data back to a permanent memory store. A more precise statement would be that, given the technologies available, researchers compromised and did without these capabilities because they realized they were beyond what could be expected, and they could develop useful solutions to non-trivial problems without them. Even from the beginning, the more astute researchers realized the potential benefits of being able to store, analyze and reuse knowledge. For example, see the discussion of Corporate Memory in the earliest work of the Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge-base compared to a conventional database. The knowledge-base needed to know facts about the world, for example to represent the statement that "All humans are mortal". A database typically could not represent this general knowledge, but would instead need to store thousands of rows representing information about specific humans. Representing that all humans are mortal, and being able to reason about any given human that they are mortal, is the work of a knowledge-base. Representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
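The contrast can be sketched in a few lines of Python. This is a deliberately tiny toy, not a real inference engine; the names and data are invented for the illustration.

```python
# Database-style storage lists one fact per individual; knowledge-base-style
# storage keeps the general rule and derives the per-individual fact.
facts = {"George": "human", "Mary": "human"}   # rows about specific individuals
rules = {("human", "mortal")}                   # the general rule: humans are mortal

def is_mortal(name):
    # Reason from class membership plus the rule, rather than looking up
    # a stored "is mortal" column for each person.
    return (facts.get(name), "mortal") in rules

print(is_mortal("George"))  # True, derived rather than stored per row
```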
As expert systems moved from being prototypes to systems deployed in corporate environments, the requirements for their data storage rapidly started to overlap with the standard database requirements for multiple, distributed users with support for transactions. Initially, the demand could be seen in two different but competitive markets. From the AI and object-oriented communities, object-oriented databases such as Versant emerged. These were systems designed from the ground up to support object-oriented capabilities, but also to support standard database services as well. On the other hand, the large database vendors such as Oracle added capabilities to their products that provided support for knowledge-base requirements, such as class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet. With the rise of the Internet, documents, hypertext and multimedia support were now critical for any corporate database. It was no longer enough to support large tables of data or relatively small objects that lived primarily in computer memory. Support for corporate web sites required persistence and transactions for documents. This created a whole new discipline known as Web Content Management. The other driver for document support was the rise of knowledge management vendors, such as Lotus Notes. Knowledge Management actually predated the Internet, but with the Internet there was great synergy between the two areas. Knowledge management products adopted the term knowledge-base to describe their repositories, but the meaning had a subtle difference. In the case of the previous knowledge-based systems, the knowledge was primarily for the use of an automated system, to reason about and draw conclusions about the world. With knowledge management products, the knowledge was primarily meant for humans, for example to serve as a repository of manuals, procedures, policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a system that could really be cleanly classified as knowledge-based in the sense of an expert system that performed automated reasoning, or knowledge-based in the sense of knowledge management that provided knowledge in the form of documents and media that could be leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018 – 2019
Subject: DBMS
MCA 1st year (Sem II)
QUE 1 -
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another. Concurrent access is quite easy if all users are just reading data, as there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user system. It helps you to make sure that database transactions are performed concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (/ˌætəˈmɪsəti/; from Ancient Greek ἄτομος, átomos, "undividable") is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client: at one moment in time it has not yet happened, and at the next it has already occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B. Performing these operations in an atomic transaction ensures that the database remains in a consistent state, that is, money is neither lost nor created if either of those two operations fails.
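The transfer example can be demonstrated with Python's built-in sqlite3 module, whose commit/rollback calls provide exactly this all-or-nothing behaviour. The table layout and the simulated failure are invented for the sketch.

```python
# Atomicity sketch: both updates commit together, or a rollback leaves
# the balances exactly as they were before the transfer started.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A", 100), ("B", 50)])
conn.commit()

def transfer(amount):
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'A'",
                     (amount,))
        # Simulate a failure between the withdrawal and the deposit.
        if amount > 100:
            raise RuntimeError("insufficient funds")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'B'",
                     (amount,))
        conn.commit()
    except RuntimeError:
        conn.rollback()  # atomicity: the partial withdrawal is undone

transfer(500)  # fails mid-way: rollback restores A = 100, B = 50
print(dict(conn.execute("SELECT name, balance FROM account")))
```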
(B) Give the three level architecture proposal for DBMS.
Ans: Objectives of the three level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one that is closest to the user. It is also called the user view. The user view is different from the way data is stored in the database; this view describes only a part of the actual database. Because each user is not concerned with the entire database, only the part that is relevant to that user is visible. For example, end users and application programmers get different external views.
Each user uses a language to carry out database operations. The application programmer uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data manipulation language performs operations on these objects. The data control language is used to control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The conceptual level represents the entire database as a whole and is used by the DBA. This level is the view of the data "as it really is". The user's view of the data is constrained by the language that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three level architecture proposal for DBMS are suitably explained above.
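The external level can be illustrated with a SQL view, here created through Python's sqlite3 module. The schema and names are invented for the sketch: the view exposes only the part of the conceptual schema relevant to one class of user, hiding the salary column.

```python
# Three-level architecture sketch: the base table is the conceptual level,
# the view is one external view, and SQLite's file/page layout (not visible
# here) is the internal level.
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: the full employee relation.
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 40000), (2, 'Ravi', 55000)")
# External level: an end-user view that hides the salary column.
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")

print(conn.execute("SELECT * FROM employee_public").fetchall())
```

Changing how the base table is stored, or adding columns to it, leaves queries against the view unaffected, which is exactly the data independence the three-level proposal aims for.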
(C) Describe the structure of DBMS.
Ans: A DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations (insert, delete, update and retrieval) on the database. The components of the DBMS perform these requested operations on the database and provide the necessary data to the users.
Fig: Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified in the DDL. It includes metadata information such as the names of the files and data items, the storage details of each file, mapping information, constraints, etc.
2. DML Compiler and Query Optimizer - DML commands such as insert, update, delete and retrieve from the application program are sent to the DML compiler for compilation into object code for database access. The object code is then optimized by the query optimizer into the best way to execute the query, and then sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also known as the Database Control System.
The main functions of the Data Manager are:
Converting operations in users' queries, coming from the application programs or from the combination of DML compiler and query optimizer (known as the Query Processor), from the user's logical view to the physical file system.
Controlling access to the DBMS information that is stored on disk.
Handling buffers in main memory.
Enforcing constraints to maintain the consistency and integrity of the data.
Synchronizing the simultaneous operations performed by concurrent users.
Controlling the backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It contains information about:
1. Data - the names of the tables, the names of the attributes of each table, the lengths of attributes, and the number of rows in each table.
2. Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures, access paths, and file and record sizes.
5. Access authorization - a description of database users, their responsibilities and their access rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation and accuracy, and may be used as an important part of the DBMS.
Importance of Data Dictionary -
The data dictionary is necessary in databases for the following reasons:
• It improves the control of the DBA over the information system and the users' understanding of the use of the system.
• It helps in documenting the database design process by storing documentation of the result of every design phase and the design decisions.
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts high level queries into low level file access commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups, depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve users: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls into this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of response is always indicated. Thus a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general purpose programming language, such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It implies the separation of physical storage from the use of the data by an application program, i.e. program/data independence: the user, programmer or application specialist need not know the details of how the data are stored, as such details are transparent to the user. Changes can be made to the data without affecting other components of the system, e.g. changing the format of data items (real to integer arithmetic operations), changing the file structure (reorganizing data internally or changing the mode of access), or relocating from one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updating of the same data in different files
• Time wasted in entering the same data again and again
• Computer resources being needlessly used
• Difficulty in combining information
2. Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data. So we need to remove this duplication of data in multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In conventional systems, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness is likely to improve, since the data can now be shared and the DBMS makes it easy to respond to anticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, use of a DBMS should allow users who do not know programming to interact with the data more easily, unlike a file processing system where the programmer may need to write new programs to meet every new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is centralized and used by a number of users at a time, it is essential to enforce integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an ad hoc/temporary manner. Often different systems of an organization would access different components of the operational data; in such an environment, enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database. Different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8. Organization's requirements can be identified - All organizations have sections and departments, and each of these units often considers the work of its unit as the most important, and therefore considers its needs as the most important. Once a database has been set up with centralized control, it will be necessary to identify the organization's requirements and to balance the needs of the competing units. So it may become necessary to ignore some requests for information if they conflict with a higher priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for similar services using conventional systems, since the productivity of programmers can be higher when using the non-procedural languages developed with DBMSs than when using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed as per the needs of particular applications, and the overall view is often not considered. Building an overall view of an organization's data is usually cost effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes such as recovery and backup from failures, including disk crashes, power failures and software errors, which may help the database to recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods are very complex.
QUE 2 - EITHER
(A) Explain the ER model with a suitable example.
Ans: It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an iterative, team-oriented process with all business managers (or designates) involved, and should be validated with a "bottom-up" approach. It has three primary components: entity, relationship and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object, such as a house or a car; an event, such as a house sale or a car service; or a concept, such as a customer transaction or order.
An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type, and there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A student's (entity) attributes: student ID, student name, address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer; a supervises relationship between an employee and a department; a performs relationship between an artist and a song; a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many (1:M)
Many to one (M:1)
Many to many (M:M)
Symbols and their meanings:
Rectangles represent entity sets.
Diamonds represent relationship sets.
Lines link attributes to entity sets, and entity sets to relationship sets.
Ellipses represent attributes.
Double ellipses represent multivalued attributes.
Dashed ellipses denote derived attributes.
Underline indicates primary key attributes.
Example:
Given: entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
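One common way to realize such an entity in a relational schema is to flatten its composite attributes into simple columns. A hedged sqlite3 sketch of the Customer entity above (column choices are one possible mapping, not the only one; apartment_number and street_number are omitted for brevity):

```python
# Mapping the Customer entity to a relation: the key attribute becomes the
# primary key, and composite attributes (name, address) become simple columns.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,  -- underlined key attribute
        first_name    TEXT,                 -- components of composite 'name'
        middle_name   TEXT,
        last_name     TEXT,
        phone_number  TEXT,
        date_of_birth TEXT,
        city          TEXT,                 -- components of composite 'address'
        state         TEXT,
        zip_code      TEXT,
        street_name   TEXT
    )
""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name) "
             "VALUES (1, 'Asha', 'Patil')")
print(conn.execute("SELECT customer_id, first_name FROM customer").fetchall())
```

A multivalued attribute (say, several phone numbers) would instead be moved to its own table keyed by customer_id, and a derived attribute such as age would be computed from date_of_birth rather than stored.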
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file and direct file organizations, we have considered the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
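A minimal sketch of such a secondary index (the field names and sample records are invented for the illustration): unlike the primary key, one secondary-key value can map to several records.

```python
# Secondary key retrieval: stud_id is the primary key (one record per value),
# while stud_name is a secondary key (possibly many records per value).
from collections import defaultdict

records = {                     # primary index: stud_id -> record
    1: {"stud_id": 1, "stud_name": "Anil"},
    2: {"stud_id": 2, "stud_name": "Sunita"},
    3: {"stud_id": 3, "stud_name": "Anil"},
}

name_index = defaultdict(list)  # secondary index: stud_name -> [stud_id, ...]
for sid, rec in records.items():
    name_index[rec["stud_name"]].append(sid)

print(name_index["Anil"])       # multiple records satisfy one key value
```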
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3 - EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE 4 - EITHER
(A) What is a join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot have a lossless decomposition into any number of smaller tables.
Another way of expressing this is that each join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer   vendor          item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
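The decomposition can be checked mechanically: joining the three projections of the sample data reconstructs exactly the original Buying table. That lossless three-way join is the join dependency which keeps the table out of 5NF. A small Python sketch (set-based, for this sample data only):

```python
# Decompose Buying(buyer, vendor, item) into Buyer-Vendor, Buyer-Item and
# Vendor-Item, then rejoin and compare with the original relation.
from itertools import product

buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

bv = {(b, v) for b, v, i in buying}   # Buyer-Vendor projection
bi = {(b, i) for b, v, i in buying}   # Buyer-Item projection
vi = {(v, i) for b, v, i in buying}   # Vendor-Item projection

# Natural join of the three projections on their common columns.
rejoined = {(b, v, i)
            for (b, v), (b2, i) in product(bv, bi)
            if b == b2 and (v, i) in vi}

print(rejoined == buying)   # the three-way join is lossless here
```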
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support both batch and online application programs.
[Fig: IMS architecture - applications A and B, each written in a host language plus DL/I, access the data through PCBs grouped into PSBs (PSB-A, PSB-B); the IMS control program maps these onto the physical databases defined by DBDs.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD). The set of all DBDs, including the mapping of the physical databases to storage, corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
DBD   NAME=EDUCPDBD
SEGM  NAME=COURSE,BYTES=256
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
FIELD NAME=DESCRIPN,BYTES=220,START=37
SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
FIELD NAME=LOCATION,BYTES=12,START=7
FIELD NAME=FORMAT,BYTES=2,START=19
SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called a program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example:
PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
SENSEG NAME=COURSE,PROCOPT=G
SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT The PROCOPT entry specifies the types of operation that the user will be permitting to
perform on this segment In this example the entry is G (ldquogetrdquo) indicating retrieval only Other
possible values are I(ldquoinsertrdquo) R(ldquoreplacerdquo) and D(ldquodeleterdquo)
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are
supported via user-written on-line application programs; IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: the value of one attribute (the determinant) determines the value of
another attribute.
Candidate Key: a possible key. Each non-key field is functionally dependent on every candidate
key, and no attribute in the key can be deleted without destroying the property of unique
identification.
Main characteristics of functional dependencies used in normalization:
They have a 1:1 relationship between the attribute(s) on the left-hand and right-hand sides of a
dependency, hold for all time, and are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is
important to find an approach that can reduce the set to a manageable size. We need to identify a
set of functional dependencies (X) for a relation that is smaller than the complete set of functional
dependencies (Y) for that relation, and that has the property that every functional dependency in Y
is implied by the functional dependencies in X.
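Whether a given dependency holds in a relation can be checked mechanically. The sketch below is illustrative (the sample relation, its attribute names, and the helper `holds` are invented for this example): a dependency X → Y holds if no two tuples agree on X but disagree on Y.

```python
def holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in rows.

    rows: list of dicts, one per tuple of the relation
    lhs, rhs: tuples of attribute names
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # same determinant value, different dependent value
    return True

emp = [
    {"empno": 1, "dept": "D1", "city": "Pune"},
    {"empno": 2, "dept": "D1", "city": "Pune"},
    {"empno": 3, "dept": "D2", "city": "Nagpur"},
]
print(holds(emp, ("empno",), ("dept",)))  # True: empno determines dept
print(holds(emp, ("dept",), ("empno",)))  # False: D1 maps to two empnos
```

Note that a check over one sample extension can only refute a dependency; whether it holds "for all time" is a statement about the schema, not the data.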
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal-form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF, or 4NF;
we will pay particular attention up to 3NF.
Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the
functional dependencies between its attributes. It is often executed as a series of steps, where each
step corresponds to a specific normal form with known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every non-key attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every non-key attribute is non-transitively dependent on the
key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and
only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF
removes an unwanted kind of data structure: multivalued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
There is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it uses
multivalued dependencies.
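As an illustration, suppose a course has a set of recommended books and a set of teachers that vary independently of each other, giving the multivalued dependencies course ↠ book and course ↠ teacher. The sketch below (the data is invented) shows the 4NF decomposition into one table per multivalued dependency, and checks that joining the projections reconstructs the original relation, i.e. the decomposition is lossless.

```python
from itertools import product

# Unnormalised facts: every (course, book, teacher) combination must be
# stored, because book and teacher vary independently of each other.
ctb = {
    ("DB", "Date", "Smith"), ("DB", "Date", "Jones"),
    ("DB", "Ullman", "Smith"), ("DB", "Ullman", "Jones"),
}

# 4NF decomposition: one relation per multivalued dependency.
course_book = {(c, b) for c, b, _ in ctb}
course_teacher = {(c, t) for c, _, t in ctb}

# The natural join of the two projections reconstructs the original
# relation, confirming the decomposition is lossless.
rejoined = {(c, b, t)
            for (c, b), (c2, t) in product(course_book, course_teacher)
            if c == c2}
print(rejoined == ctb)  # True
```

The decomposed form stores 2 + 2 tuples instead of 4 combination tuples; adding a third book would add one row rather than two.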
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by
relational database management systems (RDBMS). Object databases have been considered since the
early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found through a
more declarative programming approach. It is in the area of object query languages, and the
integration of the query and navigational interfaces, that the biggest differences between products are
found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a
relational database): an object can be retrieved directly, without a search, by following pointers. (It
could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used
determines how much time and space your backups take, and how great your risk of data loss is
when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to
explore the options available in order to prepare for the worst.
SQL Server database recovery is easier to achieve if you are running at least SQL Server 2000,
which has a built-in feature known as the database recovery model that controls the following:
The speed and size of your transaction log backups.
The degree to which you are at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery model available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction. The Log Marks feature allows you to place reference points in the transaction
log and recover to a log mark.
This model also logs CREATE INDEX operations; recovery from a transaction log backup that
includes index creation is faster because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT, and UPDATETEXT.
Simple Recovery Model
It allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
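The mechanism underlying all of these models is a transaction log that can be replayed after a failure. The sketch below is a deliberately simplified redo-log replay illustrating the general idea (it is not SQL Server's actual log format; the record layout and transaction names are invented): only work belonging to committed transactions is redone.

```python
# Each log record: (transaction id, operation, key, value).
log = [
    ("T1", "begin", None, None),
    ("T1", "write", "A", 100),
    ("T1", "commit", None, None),
    ("T2", "begin", None, None),
    ("T2", "write", "B", 50),   # T2 never committed: its work is discarded
]

def recover(log):
    """Rebuild database state from the log: redo committed writes only."""
    committed = {txn for txn, op, _, _ in log if op == "commit"}
    db = {}
    for txn, op, key, value in log:
        if op == "write" and txn in committed:
            db[key] = value
    return db

print(recover(log))  # {'A': 100}
```

Truncating the log at regular intervals, as the simple recovery model does, amounts to discarding records for transactions whose effects are already safely in the data files, which is why point-in-time restore is then impossible.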
(d) Describe deadlocks in a distributed system.
Ans:
(o) In the DBTG model, the users are application programmers writing in an ordinary
programming language, such as COBOL, that has been extended to include the DBTG
data manipulation language. Each application program invokes the corresponding
subschema using the COBOL Data Base Facility; for example, the programmer simply
specifies the name of the required subschema in the Data Division of the program. This
invocation provides the definition of the user work area (UWA) for that program. The
UWA contains a distinct location for each type of record (and hence for each data-item
type) defined in the subschema. The program may refer to these data-item and record
locations by the names defined in the subschema.
Q5
EITHER
(a) Define normalization. Explain first and second normal form.
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties.
Normalization in industry pays particular attention to
normalization up to 3NF, BCNF, or 4NF.
We will pay particular attention up to 3NF.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every non-key attribute is fully dependent on the
key.
3NF: R is in 3NF iff R is in 2NF and every non-key attribute is non-transitively dependent
on the key.
Unnormalized Form (UNF)
A table that contains one or more repeating groups
To create an unnormalized table:
transform data from the information source (e.g. a form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
If a table of data meets the definition of a relation, it is in first normal form.
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s).
Remove each repeating group by
entering appropriate data into the empty columns of rows containing the repeating
data ('flattening' the table),
or by
placing the repeating data, along with a copy of the original key attribute(s), into a
separate relation.
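The 'flattening' option can be sketched as follows (the student and course data and the field names are invented for illustration): each list-valued attribute is replaced by one row per value, so every attribute value becomes atomic.

```python
# Unnormalized form: each student row carries a repeating group of courses.
unf = [
    {"sid": "S1", "name": "Asha", "courses": ["DBMS", "OS"]},
    {"sid": "S2", "name": "Ravi", "courses": ["DBMS"]},
]

# 1NF: one row per (student, course) pair; every value is atomic.
flat = [
    {"sid": r["sid"], "name": r["name"], "course": c}
    for r in unf
    for c in r["courses"]
]
for row in flat:
    print(row)
```

The key of the flattened relation is now the composite (sid, course), which is what makes partial dependencies, and hence 2NF, relevant in the next step.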
Second Normal Form (2NF)
Based on the concept of full functional dependency: if A and B are attributes of a relation,
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A.
2NF: a relation that is in 1NF and in which every non-primary-key attribute is fully
functionally dependent on the primary key.
Equivalently: 1NF and no partial functional dependencies.
A partial functional dependency arises when one or more non-key attributes are functionally
dependent on part of the primary key.
Every non-key attribute must be defined by the entire key, not just by part of the key.
If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies exist on the primary key, remove them by placing them in a new
relation along with a copy of their determinant.
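These steps can be illustrated with toy data (the names are invented): the key of the 1NF relation is (sid, course), and name depends on sid alone, a partial dependency, so it moves to its own relation together with a copy of its determinant sid.

```python
# 1NF relation with key (sid, course); 'name' depends only on sid.
enrol_1nf = [
    {"sid": "S1", "name": "Asha", "course": "DBMS", "grade": "A"},
    {"sid": "S1", "name": "Asha", "course": "OS", "grade": "B"},
    {"sid": "S2", "name": "Ravi", "course": "DBMS", "grade": "A"},
]

# 2NF decomposition: the partial dependency sid -> name gets its own relation.
student = {(r["sid"], r["name"]) for r in enrol_1nf}
enrolment = {(r["sid"], r["course"], r["grade"]) for r in enrol_1nf}

print(sorted(student))    # each student's name is now stored exactly once
print(sorted(enrolment))
```

Storing the name once removes the update anomaly: renaming a student no longer requires touching every enrolment row.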
Third Normal Form (3NF)
2NF and no transitive dependencies. A transitive dependency is a functional dependency
between two or more non-key attributes.
Based on the concept of transitive dependency: if A, B, and C are attributes of a relation
such that A → B and B → C, then C is transitively dependent on A through B (provided
that A is not functionally dependent on B or C).
3NF: a relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key.
OR
(c) Explain multivalued dependency with a suitable example.
Ans:
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every non-key attribute is fully dependent on
the key.
3NF: R is in 3NF iff R is in 2NF and every non-key attribute is non-transitively
dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally
dependent.
As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is in
4NF if and only if it is in BCNF and all its multivalued dependencies are functional
dependencies. 4NF removes an unwanted kind of data structure: multivalued dependencies.
For example, in a relation Course(course, teacher, book) where the teachers and the
recommended books of a course vary independently of each other, the relation satisfies the
multivalued dependencies course ↠ teacher and course ↠ book.
For a relation to be in fourth normal form, one of these conditions must hold:
There is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
uses multivalued dependencies.
(d) What are inference axioms? Explain their significance in relational
database design.
Ans: Inference Axioms (A-axioms, or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy
certain other FDs.
F1 Reflexivity: X → X.
F2 Augmentation: if Z ⊆ W and X → Y, then XW → YZ.
F3 Additivity: if X → Y and X → Z, then X → YZ.
F4 Projectivity: if X → YZ, then X → Y.
F5 Transitivity: if X → Y and Y → Z, then X → Z.
F6 Pseudotransitivity: if X → Y and YZ → W, then XZ → W.
Examples of the use of the inference axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City (given)
2. Street Zip → Street City (augmentation of (1) by Street)
3. City Street → Zip (given)
4. City Street → City Street Zip (augmentation of (3) by City Street)
5. Street Zip → City Street Zip (transitivity, from (2) and (4))
[From Maier]
1. Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E (given)
2. AB → AB (reflexivity)
3. AB → B (projectivity, from (2))
4. AB → BE (additivity, from (1) and (3))
5. BE → I (given)
6. AB → I (transitivity, from (4) and (5))
7. E → G (given)
8. AB → G (transitivity, from (1) and (7))
9. AB → GI (additivity, from (6) and (8))
10. GI → H (given)
11. AB → H (transitivity, from (9) and (10))
12. AB → GH (additivity, from (8) and (11))
Significance in relational database design: a relational database is a database structure, commonly
used in GIS, in which data is stored in two-dimensional tables where multiple relationships between
data elements can be defined and established in an ad-hoc manner. A relational database management
system is a database system made up of files with data elements in two-dimensional arrays (rows
and columns); it has the capability to recombine data elements to form different relations, resulting
in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables that
are manipulated a set at a time rather than a record at a time.
SQL is used to manipulate relational databases. The relational model was proposed by Dr. Codd in 1970
and is the basis for the relational database management system (RDBMS).
The relational model contains the following components:
a collection of objects or relations, and
a set of operations to act on the relations.
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions each require access to data that
is locked by the other. It can be dealt with in two ways: one is to take measures which
prevent deadlocks from happening, and the other is to provide ways to break a deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or to
nothing. Alternatively, deadlocks can sometimes be avoided by setting a resource-access order,
meaning resources must be locked in a certain order to prevent such instances. Once a
deadlock does occur, the DBMS must have a method for detecting it; to resolve it, the
DBMS must select a victim transaction to cancel and revert that entire transaction so that the
resources it held become available, allowing one transaction to complete while the other is
reprocessed at a later time.
Explain the meaning of the expression ACID transaction.
ACID stands for Atomicity, Consistency, Isolation, and Durability. When any transaction happens,
it should be atomic: it should either be complete or fully incomplete; there should not
be anything like semi-complete. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, the transactions
should be scheduled in such a fashion that they remain in isolation from one another.
Durability means that once a transaction commits, its effects will persist even if there are
system failures.
What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the
process of being changed. Their purpose is to ensure consistency throughout the database. For
example, if I am changing a row which affects the calculations or outputs of several other rows,
then all rows that are affected, or possibly affected, by a change in the row I am working on will
be locked against changes until my change is complete. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, can also affect how those rows are read. So let's
say I am processing a change to the tax rate in my state; my store clerk should not be able
to read the total cost of a blue shirt, because the total-cost row is affected by any change in
the tax-rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but has not yet been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
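The resource-access-order idea can be sketched with two threads standing in for two transactions (the lock names and functions are invented for illustration): because both acquire the locks in the same agreed order, the circular wait needed for a deadlock can never form.

```python
import threading

a_lock, b_lock = threading.Lock(), threading.Lock()

def transfer_ab():
    # Global lock order: always a_lock before b_lock.
    with a_lock:
        with b_lock:
            pass  # work on both resources

def transfer_ba():
    # Logically touches B "first", but still locks in the agreed order,
    # so it cannot deadlock against transfer_ab.
    with a_lock:
        with b_lock:
            pass

t1 = threading.Thread(target=transfer_ab)
t2 = threading.Thread(target=transfer_ba)
t1.start(); t2.start()
t1.join(); t2.join()
print("both transactions finished without deadlock")
```

If transfer_ba instead took b_lock first, the two threads could each hold one lock and wait forever for the other, which is exactly the circular-wait condition the ordering rule removes.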
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation, and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
Binary locks: a lock on a data item can be in two states; it is either locked or
unlocked.
Shared/exclusive locks: this type of locking mechanism differentiates locks based on
their use. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock, since allowing more than one transaction to write the same data item
would lead the database into an inconsistent state. Read locks are shared because no data
value is being changed.
There are four types of lock protocol available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
Strict Two-Phase Locking
The first phase of Strict-2PL is same as 2PL After acquiring all the locks in the first phase the
transaction continues to execute normally But in contrast to 2PL Strict-2PL does not release a
lock after using it Strict-2PL holds all the locks until the commit point and releases all the locks
at a time
Strict-2PL does not have cascading abort as 2PL does
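The growing/shrinking rule of 2PL can be sketched as a toy class (illustrative only, not a real lock manager): once the transaction releases any lock, it enters the shrinking phase and any further lock request is a protocol violation.

```python
class TwoPhaseTxn:
    """Toy transaction enforcing the 2PL rule: once any lock is released
    (shrinking phase), no new lock may be acquired."""

    def __init__(self):
        self.held = set()
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after first unlock")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True   # the first release starts the shrinking phase
        self.held.discard(item)

t = TwoPhaseTxn()
t.lock("A")
t.lock("B")      # growing phase: allowed
t.unlock("A")    # shrinking phase begins
try:
    t.lock("C")  # illegal under 2PL
except RuntimeError as e:
    print(e)
```

Strict 2PL corresponds to never calling unlock until commit, which is what prevents other transactions from reading values that might later be rolled back.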
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either the system time or a logical counter as a timestamp.
Lock-based protocols manage the order between conflicting pairs of transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 00:02 would be older than all other
transactions that come after it; for example, any transaction y entering the system at 00:04 is
two seconds younger, and priority is given to the older one.
In addition, every data item is given the latest read-timestamp and write-timestamp. This lets the
system know when the last read and write operations were performed on the data item.
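The read- and write-timestamp rules can be sketched as follows (a simplified basic timestamp-ordering check, without the Thomas write rule; names are illustrative): a transaction may read an item only if no younger transaction has already written it, and may write only if no younger transaction has already read or written it; otherwise it is rolled back.

```python
class Item:
    def __init__(self):
        self.read_ts = 0    # timestamp of the youngest reader so far
        self.write_ts = 0   # timestamp of the youngest writer so far

def read(item, ts):
    if ts < item.write_ts:
        return "rollback"   # a younger transaction already overwrote the item
    item.read_ts = max(item.read_ts, ts)
    return "ok"

def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:
        return "rollback"   # a younger transaction already saw or wrote it
    item.write_ts = ts
    return "ok"

x = Item()
print(write(x, 5))   # ok: first write, by transaction with timestamp 5
print(read(x, 3))    # rollback: transaction 3 is older than the write at 5
print(read(x, 7))    # ok: 7 is younger than the writer
print(write(x, 6))   # rollback: transaction 7 has already read the item
```

A rolled-back transaction is typically restarted with a fresh (younger) timestamp, so it eventually succeeds.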
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented, and maintained by a database administrator
and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls.
Load/stress testing and capacity testing of a database to ensure it does not crash under a
distributed denial of service (DDoS) attack or user overload.
Physical security of the database server and backup equipment against theft and natural
disasters.
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them.
(d)Explain knowledge based database system in detail
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data Data was usually represented in a tabular format with strings or numbers in each
field
Multiple users A conventional database needed to support more than one user or system
logged into the same data at the same time
Transactions An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users These are the so-
called ACID properties Atomicity Consistency Isolation and Durability
Large long-lived data A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data Such a database usually needed to persist past
the specific uses of any individual program it needed to store data for years and decades
rather than for the life of a program
The first knowledge-based systems had data needs that were the opposite of these database
requirements An expert system requires structured data Not just tables with numbers and
strings but pointers to other objects that in turn have additional pointers The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes subclasses and instances
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database. The knowledge-base needed to know facts about the world: for example, to represent
the statement "All humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store information about thousands of specific humans.
Representing that all humans are mortal, and being able to reason about any given human that
they are mortal, is the work of a knowledge-base. Representing that George, Mary, Sam, Jenna,
Mike, and hundreds of thousands of other customers are all humans with specific ages, sexes,
addresses, etc. is the work of a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments,
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and object-oriented
communities, object-oriented databases such as Versant emerged. These were systems designed
from the ground up to have support for object-oriented capabilities, but also to support standard
database services as well. On the other hand, the large database vendors, such as Oracle, added
capabilities to their products that provided support for knowledge-base requirements, such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined As the technology scaled up it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning and knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by us humans
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data: there is no way they can interfere with one another. However,
any practical database has a mix of READ and WRITE operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-
user system. It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Concurrency control is therefore a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, "undividable") is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a
consistent state, that is, money is neither lost nor created if either of those two operations fails.
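The transfer example can be sketched with SQLite's transaction support (the table, account names, and amounts are invented for illustration): both updates run inside one transaction, so a failure between them rolls the database back to a consistent state.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
con.commit()

def transfer(con, amount, fail_midway=False):
    # The with-block is one atomic transaction: it commits on success
    # and rolls back automatically if an exception escapes.
    with con:
        con.execute("UPDATE account SET balance = balance - ? WHERE name = 'A'",
                    (amount,))
        if fail_midway:
            raise RuntimeError("crash between withdrawal and deposit")
        con.execute("UPDATE account SET balance = balance + ? WHERE name = 'B'",
                    (amount,))

try:
    transfer(con, 60, fail_midway=True)
except RuntimeError:
    pass
# The partial withdrawal was rolled back: no money was lost or created.
print(con.execute("SELECT SUM(balance) FROM account").fetchone()[0])  # 100

transfer(con, 60)  # succeeds: both updates commit together
print(con.execute("SELECT balance FROM account WHERE name = 'B'").fetchone()[0])  # 60
```

The total balance is 100 before, after the failed attempt, and after the successful transfer, which is exactly the consistency guarantee atomicity provides here.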
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for a DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
These three levels are explained in detail below:
External Level
This is the highest level one that is closest to the user It is also called the user view The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The Data Definition Language defines and declares database objects, while the Data
Manipulation Language performs operations on these objects. The Data Control Language is used to
control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus, the objectives of the three level architecture proposal for DBMS are explained above.
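One common way the external level is realized in practice is through views: the conceptual level holds the full table, while each user sees only a derived subset. A minimal sketch using Python's built-in sqlite3 (table and column names are invented for illustration):

```python
import sqlite3

# Conceptual level: the full employee table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Asha", 50000), (2, "Ravi", 60000)])

# External level: a view that hides the salary column from ordinary users.
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")
print(conn.execute("SELECT * FROM employee_public").fetchall())
```

Because users query the view rather than the base table, the DBA can reorganize the underlying storage or add columns without affecting this external view, which is the data-independence objective listed above.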
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - The Data Description Language compiler processes schema definitions specified
in the DDL. It includes metadata information such as the names of the files, data items, storage
details of each file, mapping information, constraints, etc.
2 DML Compiler and Query Optimizer - The DML commands (insert, update, delete,
retrieve) from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized into the best way to execute the query by
the query optimizer, and then sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are -
It converts operations in users' queries, coming from the application programs or from the
combination of DML compiler and query optimizer (together known as the Query Processor), from the user's logical view
to the physical file system.
It controls access to the DBMS information that is stored on disk.
It controls the handling of buffers in main memory.
It enforces constraints to maintain the consistency and integrity of the data.
It synchronizes the simultaneous operations performed by concurrent users.
It controls the backup and recovery operations.
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1 Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2 Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3 Constraints on data, i.e. the range of values permitted.
4 Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
5 Access authorization - a description of database users, their responsibilities and their access rights.
6 Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control data integrity, database operation
and accuracy, and may be used as an important part of the DBMS.
Importance of the Data Dictionary -
The data dictionary is necessary in databases for the following reasons:
It improves the control of the DBA over the information system and the users'
understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts the high-level queries into low-level file access
commands known as compiled DML.
7 End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1 Naiumlve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naive users: Naive users need not be aware of the presence of the database system or of any other system supporting their usage. A user of an automatic teller machine falls under this category: the user is instructed through each step of a transaction and responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect only a precise portion of the database - in the case of the ATM user, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise through the limited interaction they are permitted with the database via the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing the application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language, such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator Centralized control of the database is exerted by a person or group of persons under the supervision of a high level administrator This person or group is referred to as the database administrator (DBA) They are users who are the most familiar with the database and are responsible for creating modifying and maintaining its three levels
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structures and access methods to be used for optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of physical storage from the use of the
data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored; such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one device to
another (e.g. from optical to magnetic storage, from tape to disk).
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files.
• Wastage of storage space, since duplicated data is stored.
• Errors generated due to updating of the same data in different files.
• Time wasted in entering the same data again and again.
• Computer resources needlessly used.
• Difficulty in combining information.
2 Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data, so we need to remove this duplication of
data across multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In
a conventional system, the availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness are likely to improve, since the data can now be shared, and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, use of a DBMS
should allow users who don't know programming to interact with the data more easily, unlike a
file processing system, where the programmer may need to write new programs to meet every
new demand.
4 Flexibility of the System is improved - Since changes are often necessary to the contents of
the data stored in any system these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the
data in the database changes.
5 Integrity can be improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through DBMS so
standards are easier to enforce Standards may relate to the naming of data format of data
structure of the data etc Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems
7 Security can be improved - In conventional systems, applications are developed in an
ad hoc/temporary manner. Often different systems of an organization access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized It is easier to control who has access to what parts of the database Different
checks can be established for each type of access (retrieve modify delete etc) to each piece
of information in the database
8 Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work the most
important, and therefore its own needs the most important. Once a database has been
set up with centralized control, it will be necessary to identify the organization's requirements and
to balance the needs of the competing units. It may therefore become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed with DBMSs than using procedural languages.
10 A Data Model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand, and the overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes for backup and
recovery from failures, including disk crashes, power failures and software errors,
which help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods can be very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The E-R model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process in which all business managers (or their designates)
should be involved, and it should be validated with a "bottom-up" approach. It has three primary components: entities,
relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type. There are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym.
Attributes: An attribute is a characteristic of an entity. A Student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One to many (1 : M)
Many to one (M : 1)
Many to many (M : N)
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
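One common way the Customer entity above is mapped to relations: composite attributes are flattened into simple columns, and the multivalued phone_number goes in its own table. A sketch using Python's built-in sqlite3 (the mapping shown is one conventional choice, not the only one):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    first_name    TEXT, middle_name TEXT, last_name TEXT,  -- composite: name
    date_of_birth TEXT,
    city TEXT, state TEXT, zip_code TEXT,                  -- composite: address
    street_name TEXT, street_number TEXT, apartment_number TEXT
);
CREATE TABLE customer_phone (                              -- multivalued attribute
    customer_id  INTEGER REFERENCES customer(customer_id),
    phone_number TEXT
);
""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name) "
             "VALUES (1, 'Jane', 'Doe')")
conn.executemany("INSERT INTO customer_phone VALUES (?, ?)",
                 [(1, '555-0100'), (1, '555-0199')])
# One customer, two phone numbers: the multivalued attribute needs its own rows.
print(conn.execute("SELECT COUNT(*) FROM customer_phone "
                   "WHERE customer_id = 1").fetchone()[0])
```

A derived attribute such as age would not be stored at all; it would be computed from date_of_birth at query time.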
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index sequential files and direct files, we have considered the retrieval and
update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
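The stud_name example above can be sketched as a secondary index: a mapping from a non-unique key to every matching record, unlike the primary index, which maps a unique key to exactly one record (the student data is invented):

```python
# Sketch of secondary key retrieval: an index on the non-unique attribute
# stud_name returns the set of all matching records.
students = [
    {"roll_no": 1, "stud_name": "Amit"},
    {"roll_no": 2, "stud_name": "Priya"},
    {"roll_no": 3, "stud_name": "Amit"},
]

# Build the secondary index: stud_name -> list of primary keys (roll numbers).
secondary_index = {}
for rec in students:
    secondary_index.setdefault(rec["stud_name"], []).append(rec["roll_no"])

print(secondary_index["Amit"])  # multiple records satisfy one key value: [1, 3]
```

Point (ii) above is visible directly: a single secondary key value ("Amit") maps to a set of records, not one.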
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3- EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows:
1 If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot
be further non-trivially decomposed (losslessly) into smaller tables.
Another way of expressing this is that each join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be non-trivially decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell Jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key: to determine
the item you must know the buyer and vendor, to determine the vendor you must know the buyer and
the item, and to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item, and Vendor-Item.
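The decomposition can be checked mechanically. The sketch below projects the sample Buying data onto the three two-column tables and rejoins them; for this instance the three-way join reproduces the original relation exactly, which is what the join dependency behind 5NF requires:

```python
from itertools import product

# The sample Buying relation from the table above.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three projections: Buyer-Vendor, Buyer-Item, Vendor-Item.
buyer_vendor = {(b, v) for b, v, _ in buying}
buyer_item   = {(b, i) for b, _, i in buying}
vendor_item  = {(v, i) for _, v, i in buying}

# Natural join of all three projections: keep a triple only if each of its
# three pairs appears in the corresponding projection.
rejoined = {
    (b, v, i)
    for (b, v), (v2, i) in product(buyer_vendor, vendor_item)
    if v == v2 and (b, i) in buyer_item
}
print(rejoined == buying)  # True: the decomposition is lossless here
```

If the join dependency did not hold, the rejoin would contain spurious tuples not present in the original; 5NF guarantees that every such decomposition onto the candidate keys joins back losslessly.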
(B) Explain the architecture of an IMS System
Ans Information Management system (IMS) is an IBM program product that is designed to support
both batch and online application programs
[Fig.: Architecture of an IMS system - application programs A and B (host language + DL/I), each with its PSB containing PCBs, mapped through DBDs to the IMS control program]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD), which also defines the mapping of the physical database to storage.
The set of DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping
definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are
supported via user-written online application programs; IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency The value of one attribute (the determinant)
determines the value of another attribute
Candidate Key A possible key
Each non-key field is functionally dependent on every candidate key
No attribute in the key can be deleted without destroying the property of
unique identification
Main characteristics of the functional dependencies used in
normalization:
they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of
the dependency, they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation,
and has the property that every functional dependency in Y is implied by the
functional dependencies in X.
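The definition above can be checked mechanically: X → Y holds in a relation instance iff no two rows agree on X but disagree on Y. A minimal sketch (the relation and attribute names are invented for illustration):

```python
# Check whether the functional dependency lhs -> rhs holds in a set of rows:
# it fails exactly when two rows share the same lhs values but differ on rhs.
def holds_fd(rows, lhs, rhs):
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # same determinant, different dependent values
    return True

emp = [
    {"emp_id": 1, "dept": "Sales", "dept_head": "Asha"},
    {"emp_id": 2, "dept": "Sales", "dept_head": "Asha"},
    {"emp_id": 3, "dept": "HR",    "dept_head": "Ravi"},
]
print(holds_fd(emp, ["dept"], ["dept_head"]))   # True: dept determines dept_head
print(holds_fd(emp, ["dept_head"], ["emp_id"])) # False: Asha maps to two emp_ids
```

Note that an instance can only refute a dependency, never prove it "for all time"; the check above confirms that a proposed FD is at least consistent with the data at hand.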
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties
Normalization in industry pays particular attention to normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
The database designers need not normalize to the highest possible normal form
Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes
Often executed as a series of steps Each step corresponds to a specific normal form which has
known properties
As normalization proceeds relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
BCNF: R is in BCNF iff every determinant is a candidate key
Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all its multi-valued dependencies are in fact functional dependencies. 4NF
removes unwanted data structures: multi-valued dependencies.
Either of these conditions must hold for a relation to be in fourth normal form:
There is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it deals with
multivalued dependencies.
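The idea can be illustrated with a small invented relation (not from the question paper): in course(course, teacher, book), teachers and books vary independently, so course ->> teacher and course ->> book are non-trivial MVDs, and 4NF calls for splitting the table in two:

```python
from itertools import product

# A relation violating 4NF: every teacher of "DB" is paired with every book,
# because the two attributes are independent given the course.
ctb = {
    ("DB", "Asha", "Date"), ("DB", "Asha", "Korth"),
    ("DB", "Ravi", "Date"), ("DB", "Ravi", "Korth"),
}

# 4NF decomposition: one table per multivalued dependency.
course_teacher = {(c, t) for c, t, _ in ctb}
course_book    = {(c, b) for c, _, b in ctb}

# Joining the two projections on course reconstructs the original relation.
rejoined = {(c, t, b)
            for (c, t), (c2, b) in product(course_teacher, course_book)
            if c == c2}
print(rejoined == ctb)  # True: the MVD makes this two-way join lossless
```

The undecomposed table must store 2 × 2 combination rows; after the 4NF split, adding a new teacher or book means inserting one row instead of several, which is the redundancy 4NF removes.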
Q5
Either
(A) What are object oriented database systems What are its features
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and of the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could retrieve a user's account information and
efficiently provide extensive details such as transactions and account entries.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000,
which has a built-in feature known as the database recovery model that controls the following:
Both the speed and size of your transaction log backups.
The degree to which you might be at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery models available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. The SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after a media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
The Log Marks feature allows you to place reference points in the transaction log, so that you can
recover to a log mark.
This model logs CREATE INDEX operations. Recovery from a transaction log backup that includes index
creations is faster, because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
It allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(D) Describe deadlocks in a distributed system.
Ans
Q5
EITHER
(a) Define Normalization. Explain first and second normal form.
Ans: Normalization is the process of decomposing unsatisfactory "bad" relations by
breaking up their attributes into smaller relations.
Normalization is carried out in practice so that the resulting designs are of high quality
and meet the desirable properties
Normalization in industry pays particular attention to
normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent
on the key
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table:
transform data from the information source (e.g., a form) into table format with columns
and rows
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value
If a table of data meets the definition of a relation it is in first normal form
Every relation has a unique name
Every attribute value is atomic (single-valued)
Every row is unique
Attributes in tables have unique names
The order of the columns is irrelevant
The order of the rows is irrelevant
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table
Identify the repeating group(s) in the unnormalized table which repeat for the key attribute(s)
Remove the repeating group by:
entering appropriate data into the empty columns of rows containing the repeating
data ('flattening' the table)
Or by:
placing the repeating data, along with a copy of the original key attribute(s), into a
separate relation
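The second option above, placing the repeating data with a copy of the key into a separate relation, can be sketched in Python (the student/course data is invented for illustration):

```python
# UNF table: one row per student, with a repeating group of courses.
unf = [
    {"student_id": 1, "name": "Ann", "courses": ["DBMS", "OS"]},
    {"student_id": 2, "name": "Raj", "courses": ["DBMS"]},
]

# 1NF: move the repeating group into a separate relation,
# carrying a copy of the original key (student_id).
students = [{"student_id": r["student_id"], "name": r["name"]} for r in unf]
enrolments = [
    {"student_id": r["student_id"], "course": c}
    for r in unf
    for c in r["courses"]
]

print(students)    # every attribute value is now atomic
print(enrolments)  # one row per (student, course) pair
```

After the split, every intersection of a row and column holds a single value, so both relations satisfy 1NF.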
Second Normal Form (2NF)
Based on the concept of full functional dependency:
A and B are attributes of a relation.
B is fully dependent on A if B is functionally dependent on A but not on any
proper subset of A.
2NF: a relation that is in 1NF and in which every non-primary-key attribute is fully
functionally dependent on the primary key.
Equivalently: 1NF and no partial functional dependencies.
Partial functional dependency: one or more non-key attributes are functionally
dependent on part of the primary key.
Every non-key attribute must be defined by the entire key, not just by part of the key.
If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF
Identify the primary key for the 1NF relation
Identify the functional dependencies in the relation
If partial dependencies on the primary key exist, remove them by placing them in a new
relation along with a copy of their determinant
Third Normal Form (3NF)
2NF and no transitive dependencies
Transitive dependency: a functional dependency between two or more non-key attributes
Based on the concept of transitive dependency:
A, B, and C are attributes of a relation such that if A → B and B → C, then C is
transitively dependent on A through B (provided that A is not functionally
dependent on B or C)
3NF: a relation that is in 1NF and 2NF and in which no non-primary-key
attribute is transitively dependent on the primary key
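A small worked sketch of the 1NF-to-2NF step (the relation, key, and dependency are invented): in R(order_id, product_id, product_name, qty) with key {order_id, product_id} and product_id → product_name, product_name is only partially dependent on the key, so it moves to a new relation together with its determinant:

```python
# 1NF relation with a partial dependency:
# key = (order_id, product_id), but product_id -> product_name.
order_lines = [
    {"order_id": 1, "product_id": "P1", "product_name": "Pen", "qty": 2},
    {"order_id": 1, "product_id": "P2", "product_name": "Pad", "qty": 1},
    {"order_id": 2, "product_id": "P1", "product_name": "Pen", "qty": 5},
]

# 2NF: place the partially dependent attribute in a new relation
# along with a copy of its determinant (product_id).
products = {r["product_id"]: r["product_name"] for r in order_lines}
order_lines_2nf = [
    {"order_id": r["order_id"], "product_id": r["product_id"], "qty": r["qty"]}
    for r in order_lines
]

print(sorted(products.items()))  # the redundant name is now stored once per product
```

Note how the decomposition also removes the redundancy: "Pen" was stored twice in the 1NF table and only once afterwards.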
OR
(c) Explain multivalued dependency with a suitable example.
As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
Ans
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally
dependent
Fourth Normal Form (4NF)
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies
of attribute sets on anything other than a superset of a candidate key. A table is said to be in
4NF if and only if it is in BCNF and every non-trivial multivalued dependency is also a functional
dependency. 4NF removes unwanted data structures: multivalued dependencies.
Either of these conditions must hold in order for a relation to be in fourth normal form:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
uses multivalued dependencies.
Example: in a relation R(Course, Teacher, Book), the set of teachers and the set of textbooks for
a course are independent of each other, so Course →→ Teacher and Course →→ Book are multivalued
dependencies; R must be decomposed into (Course, Teacher) and (Course, Book) to reach 4NF.
(d) What are inference axioms? Explain their significance in Relational
Database Design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy
certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: if Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: if X → Y and X → Z, then X → YZ
F4 Projectivity: if X → YZ, then X → Y
F5 Transitivity: if X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: if X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City (given)
2. Street Zip → Street City (augmentation of (1) by Street)
3. City Street → Zip (given)
4. City Street → City Street Zip (augmentation of (3) by City Street)
5. Street Zip → City Street Zip (transitivity of (2) and (4))
[From Maier]
1. Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E (given)
2. AB → AB (reflexivity)
3. AB → B (projectivity from (2))
4. AB → BE (additivity from (1) and (3))
5. BE → I (given)
6. AB → I (transitivity from (4) and (5))
7. E → G (given)
8. AB → G (transitivity from (1) and (7))
9. AB → GI (additivity from (6) and (8))
10. GI → H (given)
11. AB → H (transitivity from (9) and (10))
12. AB → GH (additivity from (8) and (11))
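Derivations like this can be checked mechanically: compute the attribute closure X+ under F, and X → Y holds iff Y ⊆ X+. A small sketch using the same F:

```python
# Attribute closure under a set of functional dependencies.
def closure(attrs, fds):
    """fds: list of (lhs, rhs) pairs, each a set of attributes."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is in the closure, add the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# F from the Maier example: AB->E, AG->J, BE->I, E->G, GI->H
F = [
    ({"A", "B"}, {"E"}),
    ({"A", "G"}, {"J"}),
    ({"B", "E"}, {"I"}),
    ({"E"}, {"G"}),
    ({"G", "I"}, {"H"}),
]

ab_plus = closure({"A", "B"}, F)
print(sorted(ab_plus))        # ['A', 'B', 'E', 'G', 'H', 'I', 'J']
print({"G", "H"} <= ab_plus)  # True, so AB -> GH holds
```

The loop simply applies the axioms to a fixpoint, which is why closure computation is the standard way to test whether an FD follows from F.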
Significance in Relational Database design: A relational database is a database structure, commonly used in GIS, in
which data is stored in two-dimensional tables and multiple relationships between data
elements can be defined and established in an ad hoc manner. A Relational Database Management
System is a database system made up of files with data elements in a two-dimensional array (rows
and columns). This database management system has the capability to recombine data elements
to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables.
Tables are manipulated a set at a time rather than a record at a time.
SQL is used to manipulate relational databases. The relational model was proposed by Dr. Codd in 1970
and is the basis for the relational database management system (RDBMS).
The relational model contains the following components:
a collection of objects or relations
a set of operations to act on the relations
Q5
EITHER
(a) What is a deadlock? How can it be avoided? How can it be
resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions each require access to data that
is locked by the other. It can be handled in two ways: one is to set measures which
prevent deadlocks from happening, and the other is to set ways in which to break the deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or to
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource-access order,
meaning resources must be locked in a certain order to prevent such instances. Essentially,
once a deadlock does occur, the DBMS must have a method for detecting the deadlock;
to resolve it, the DBMS must select a transaction to cancel and revert that entire
transaction until the resources required become available, allowing one transaction to
complete while the other is reprocessed at a later time.
9.21 Explain the meaning of the expression ACID transaction.
ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it
should be atomic: it should either complete fully or not at all; there should
not be anything like a semi-complete transaction. The database state should remain consistent after the
completion of the transaction. If there is more than one transaction, the transactions
should be scheduled in such a fashion that they remain in isolation from one another. Durability
means that once a transaction commits, its effects will persist even if there are system failures.
9.24 What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the process of being
changed. Their purpose is to ensure consistency throughout the database. For example, if I
am changing a row which affects the calculations or outputs of several other rows, then
all rows that are affected, or possibly affected, by a change in the row I am working on will
be locked from changes until I am finished with my change. This isolates the change and
ensures that the data interaction remains accurate and consistent, and is known as
transaction-level consistency. The transaction being changed, which may affect several
other pieces of data or rows of input, could also affect how those rows are read. So let's
say I am processing a change to the tax rate in my state; my store clerk should not be able
to read the total cost of a blue shirt, because the total-cost row is affected by any change in
the tax-rate row. Essentially, how you deal with the reading and viewing of data while a
change is being processed but has not been committed is known as the transaction
isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction
being committed.
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation, and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
Binary locks: a lock on a data item can be in two states; it is either locked or
unlocked.
Shared/exclusive: this type of locking mechanism differentiates the locks based on
their use. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock; allowing more than one transaction to write the same data item
would lead the database into an inconsistent state. Read locks are shared, because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first
part, when the transaction starts executing, it seeks permission for the locks it requires. The
second part is where the transaction acquires all the locks. As soon as the transaction releases its
first lock, the third phase starts; in this phase, the transaction cannot demand any new locks, it
only releases the acquired locks.
Two-phase locking has two phases: one is growing, where all the locks are being acquired by
the transaction; the second phase is shrinking, where the locks held by the transaction are
being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the
transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a
lock after using it: Strict-2PL holds all the locks until the commit point and releases them all
at once.
Strict-2PL therefore does not suffer the cascading aborts that 2PL can.
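The growing/shrinking discipline can be illustrated with a toy, single-threaded sketch (the class and method names are invented; a real lock manager must also arbitrate conflicts between transactions):

```python
# Minimal sketch of the two-phase rule: once a transaction releases
# its first lock (shrinking phase), it may not acquire any new locks.
class TwoPhaseTxn:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False  # becomes True after the first release

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: 2PL violated - lock after release")
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True   # the growing phase is over
        self.locks.discard(item)

t = TwoPhaseTxn("T1")
t.lock("A")
t.lock("B")      # growing phase: allowed
t.unlock("A")    # shrinking phase begins
try:
    t.lock("C")  # violates two-phase locking
except RuntimeError as e:
    print(e)
```

Strict-2PL would additionally defer every `unlock` until commit, which is what prevents cascading aborts.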
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based protocol This protocol
uses either system time or logical counter as a timestamp
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution whereas timestamp-based protocols start working as soon as a transaction is
created
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 00:02 would be older than all other
transactions that come after it; for example, any transaction y entering the system at 00:04 is
two seconds younger, and priority is given to the older one.
In addition, every data item is given the latest read and write timestamp. This lets the system
know when the last 'read' and 'write' operation was performed on the data item.
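These per-item read/write timestamps drive the basic timestamp-ordering checks. A simplified sketch (it ignores Thomas's write rule and recovery concerns; the names are illustrative):

```python
# Basic timestamp-ordering checks: an operation arriving "too late"
# relative to a younger transaction forces a rollback.
class Item:
    def __init__(self):
        self.read_ts = 0   # timestamp of the youngest reader so far
        self.write_ts = 0  # timestamp of the youngest writer so far

def read(item, ts):
    if ts < item.write_ts:           # a younger txn already wrote: too late
        return "rollback"
    item.read_ts = max(item.read_ts, ts)
    return "ok"

def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:
        return "rollback"            # a younger txn already read or wrote
    item.write_ts = ts
    return "ok"

x = Item()
print(write(x, ts=2))   # ok: first write, by the txn with timestamp 2
print(read(x, ts=1))    # rollback: txn 1 is older than the last writer
print(read(x, ts=3))    # ok: txn 3 is younger
print(write(x, ts=2))   # rollback: txn 3 has already read x
```

The older transaction is always the one allowed to proceed, matching the priority rule described above.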
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases. This
includes:
Data stored in the database
The database server
The database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented, and maintained by a database administrator
and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash under a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d)Explain knowledge based database system in detail
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data Data was usually represented in a tabular format with strings or numbers in each
field
Multiple users A conventional database needed to support more than one user or system
logged into the same data at the same time
Transactions An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users These are the so-
called ACID properties Atomicity Consistency Isolation and Durability
Large long-lived data A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data Such a database usually needed to persist past
the specific uses of any individual program it needed to store data for years and decades
rather than for the life of a program
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in the artificial
intelligence literature) with classes, subclasses, and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge base compared to a conventional
database. The knowledge base needed to know facts about the world, for example, to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store information about thousands of tables that
represented information about specific humans. Representing that all humans are mortal, and
being able to reason about any given human that they are mortal, is the work of a knowledge
base. Representing that George, Mary, Sam, Jenna, Mike, and hundreds of thousands of other
customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements, such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined As the technology scaled up it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning and knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by us humans
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data: there is no way they can interfere with one another. Any practical
database, though, has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user
system. It helps you make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur, or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails.
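The transfer example can be demonstrated with Python's built-in sqlite3 module, whose connection context manager commits on success and rolls back on an exception (the account names and amounts are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

try:
    with conn:  # one atomic transaction: commit on success, rollback on error
        conn.execute("UPDATE account SET balance = balance - 70 WHERE name = 'A'")
        raise RuntimeError("crash before crediting B")  # simulated failure
        conn.execute("UPDATE account SET balance = balance + 70 WHERE name = 'B'")
except RuntimeError:
    pass

# The withdrawal was rolled back: money is neither lost nor created.
print(conn.execute("SELECT balance FROM account ORDER BY name").fetchall())
# [(100,), (50,)]
```

Because the failure happened inside the transaction, the partial withdrawal from A never becomes visible, which is exactly the atomicity guarantee described above.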
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
The above three levels are explained in detail below:
External Level
This is the highest level, the one closest to the user; it is also called the user view. The user
view is different from the way data is stored in the database. This view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part that
is relevant to the user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level, the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus the objectives of the three-level architecture proposal for DBMS are suitably explained
above.
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified
in the DDL. It includes metadata information such as the names of the files and data items, storage
details of each file, mapping information, and constraints.
2. DML Compiler and Query Optimizer - DML commands such as insert, update, delete, and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized in the best way to execute the query by
the query optimizer and then sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
Converts operations in users' queries, coming from the application programs or the combination of
DML compiler and query optimizer (known as the Query Processor), from the user's logical view
to the physical file system.
Controls DBMS information access that is stored on disk.
Handles buffers in main memory.
Enforces constraints to maintain the consistency and integrity of the data.
Synchronizes the simultaneous operations performed by concurrent users.
Controls the backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e., the range of values permitted.
4. Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
5. Access authorization - the description of database users, their responsibilities
and their access rights.
6. Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation,
and accuracy; it may be used as an important part of the DBMS.
Importance of Data Dictionary - the data dictionary is necessary in databases due to the following reasons:
It improves the control of the DBA over the information system and the users'
understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the result of every design phase and of the design decisions.
5. Data Files - They contain the data portion of the database.
6. Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve users: users who need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naïve users are those for whom the type and range of response is always indicated. Thus, even a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online users: users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naïve users requiring help, such as menus.
iii) Application programmers: professional programmers who are responsible for developing application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general-purpose programming language, such as Assembler, C, COBOL, FORTRAN, PASCAL, or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying, and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view, or conceptual level, of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the spatial database
becomes a common resource for an agency. It implies separation of physical storage from use of the
data by an application program, i.e., program/data independence: the user, programmer, or
application specialist need not know the details of how the data are stored; such details are
transparent to the user. Changes can be made to data without affecting other components of the
system, e.g., changing the format of data items (real to integer arithmetic operations), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one
device to another (e.g., from optical to magnetic storage, or from tape to disk).
Advantages
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updating of the same data in different files
• Time wasted in entering the same data again and again
• Needless use of computer resources
• Difficulty in combining information
2. Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data, so this duplication of data across multiple files needs to be removed to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, the use of a DBMS should allow users who do not know programming to interact with the data more easily, unlike a file processing system, where the programmer may need to write new programs to meet every new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not change when the data in the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is centralized and used by a number of users at a time, it is essential to enforce integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates may sometimes lead to the entry of incorrect data in some of the files where it exists.
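The point about centrally enforced integrity constraints can be sketched with SQLite; the table and column names here are illustrative assumptions, not part of the question. Every application using the database gets the same checks for free:

```python
import sqlite3

# Hedged sketch: hypothetical dept/emp tables showing how a DBMS enforces
# integrity constraints centrally instead of in every application program.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""CREATE TABLE emp (
    emp_id  INTEGER PRIMARY KEY,
    salary  REAL CHECK (salary > 0),          -- domain constraint
    dept_id INTEGER REFERENCES dept(dept_id)  -- referential integrity
)""")
conn.execute("INSERT INTO dept VALUES (1, 'Sales')")
conn.execute("INSERT INTO emp VALUES (100, 50000.0, 1)")    # accepted

try:
    conn.execute("INSERT INTO emp VALUES (101, -10.0, 1)")  # violates CHECK
except sqlite3.IntegrityError as e:
    print("rejected:", e)

try:
    conn.execute("INSERT INTO emp VALUES (102, 40000.0, 9)")  # no such dept
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

Both bad inserts are rejected by the DBMS itself, so no application can slip inconsistent data past the constraints.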
6. Standards can be enforced - Since all access to the database must go through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an ad hoc, temporary manner. Often different systems of an organization access different components of the operational data; in such an environment, enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to which parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8. The organization's requirements can be identified - All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, as the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. It may become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for a similar service using conventional systems, since the productivity of programmers can be higher using the non-procedural languages developed with DBMSs than using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed as the needs of particular applications demand; the overall view is often not considered. Building an overall view of an organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes for backup and recovery from failures, including disk crashes, power failures and software errors, which help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods involved are very complex.
QUE 2 - EITHER
(A) Explain the E-R model with a suitable example.
Ans: The E-R model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Building the model is an iterative, team-oriented process in which all business managers (or their designates) should be involved, and the result should be validated with a "bottom-up" approach. The model has three primary components: entities, relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name, address, etc.
Attributes are of various types:
Simple/single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer; a supervises relationship between an employee and a department; a performs relationship between an artist and a song; a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One-to-many: 1 ------< M
Many-to-one: M >------ 1
Many-to-many: M >-----< M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example:
Entity: Customer, with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, and address (city, state, zip_code, street), where street is itself composite (street_name, street_number, apartment_number).
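The Customer entity above can be mapped to a relational table. Here is a minimal sketch using Python's sqlite3; flattening the composite attributes into simple columns is one possible mapping, chosen for illustration:

```python
import sqlite3

# Sketch: the Customer entity mapped to a table. Composite attributes
# (name, address, street) are flattened into simple columns;
# customer_id is the primary key. Column names follow the example above.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE customer (
    customer_id      INTEGER PRIMARY KEY,
    first_name       TEXT, middle_name TEXT, last_name TEXT,  -- composite: name
    phone_number     TEXT,
    date_of_birth    TEXT,
    city TEXT, state TEXT, zip_code TEXT,                     -- composite: address
    street_name TEXT, street_number TEXT, apartment_number TEXT
)""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name, city) "
             "VALUES (1, 'Asha', 'Rao', 'Nagpur')")
print(conn.execute("SELECT first_name, city FROM customer").fetchone())
# ('Asha', 'Nagpur')
```

A multivalued attribute (for example, several phone numbers per customer) could not be flattened this way; it would go into a separate table keyed by customer_id.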
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index sequential files and direct files, we have considered the retrieval and update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
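A minimal sketch of the example above, assuming a hypothetical student table: the primary key stud_id identifies exactly one record, while a secondary index on stud_name can match several:

```python
import sqlite3

# Sketch of secondary-key retrieval: stud_id is the primary key, while
# stud_name is a secondary key on which several records may match.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stud_id INTEGER PRIMARY KEY, stud_name TEXT)")
conn.execute("CREATE INDEX idx_name ON student(stud_name)")  # secondary index
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Amit"), (2, "Priya"), (3, "Amit")])

rows = conn.execute(
    "SELECT stud_id FROM student WHERE stud_name = 'Amit'").fetchall()
print(rows)  # two records satisfy the same secondary-key value
```

Primary-key retrieval returns at most one record; the secondary-key lookup here returns a set of records, which is exactly point (ii) above.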
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation

QUE 3 - EITHER
(A) Let R = (A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE 4 - EITHER
(A) What is join dependency? Discuss 5NF.
Ans Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and cannot be further losslessly decomposed into any number of smaller tables. Another way of expressing this is that each join dependency is a consequence of the candidate keys. It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependency: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise);
For any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This relation is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
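The decomposition above can be checked on the sample data: project Buying onto the three two-column tables, then join them back. When the join dependency holds for an instance, the three-way join reproduces exactly the original rows. A sketch in Python, using sets of tuples:

```python
# The Buying relation from the sample data above.
buying = {("Sally", "Liz Claiborne", "Blouses"),
          ("Mary",  "Liz Claiborne", "Blouses"),
          ("Sally", "Jordach", "Jeans"),
          ("Mary",  "Jordach", "Jeans"),
          ("Sally", "Jordach", "Sneakers")}

# The three projections: Buyer-Vendor, Buyer-Item, Vendor-Item.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Three-way natural join of the projections.
rejoined = {(b, v, i)
            for b, v in buyer_vendor
            for b2, i in buyer_item if b2 == b
            for v2, i2 in vendor_item if v2 == v and i2 == i}

print(rejoined == buying)  # True: no spurious tuples for this instance
```

If Claiborne starts selling jeans, only one row per projection needs adding; in the unnormalized table, a row per buyer would be required, which is the anomaly 5NF removes.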
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support both batch and online application programs.
[Figure: IMS system architecture. Application programs A and B, each written in a host language plus DL/I, access the system through the PCBs of their respective PSBs (PSB-A, PSB-B); the IMS control program maps these external views onto the physical databases defined by the DBDs.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined, together with its mapping to storage, by a database description (DBD). The set of DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE#,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called a program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example:
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operations that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are supported via user-written online application programs; IMS does not provide an integrated query language.
OR
(C) Explain the following:
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
The main characteristics of the functional dependencies used in normalization are that they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency, that they hold for all time, and that they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set (Y) for that relation and has the property that every functional dependency in Y is implied by the functional dependencies in X.
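As a sketch of what "the determinant determines the dependent" means on a concrete relation instance, the following hypothetical helper tests whether X → Y holds in a list of rows; the attribute names and data are illustrative assumptions:

```python
# X -> Y holds in an instance iff every pair of rows agreeing on X
# also agrees on Y.
def holds(rows, X, Y):
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in X)
        y = tuple(row[a] for a in Y)
        if x in seen and seen[x] != y:
            return False          # same determinant, different dependent
        seen[x] = y
    return True

students = [
    {"stud_id": 1, "stud_name": "Amit",  "city": "Nagpur"},
    {"stud_id": 2, "stud_name": "Priya", "city": "Pune"},
    {"stud_id": 3, "stud_name": "Amit",  "city": "Nagpur"},
]
print(holds(students, ["stud_id"], ["stud_name"]))   # True: the key determines the name
print(holds(students, ["stud_name"], ["stud_id"]))   # False: "Amit" maps to both 1 and 3
```

Note the asymmetry: an instance can only refute a dependency, not prove it for all time, which is why the FDs used in normalization must "hold for all time" as stated above.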
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal form condition that it meets, and hence indicates the degree to which it has been normalized. Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.

Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and its multivalued dependencies are functional dependencies. 4NF removes unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form, either there is no multivalued dependency in the relation, or there are multivalued dependencies but the attributes are dependent between themselves. One of these conditions must hold, and the relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers multivalued dependencies.
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as they are in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could retrieve a user's account information and efficiently provide extensive information such as transactions and account entries.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used determines how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time, even to the best configured systems, which is why you have to explore the available options in order to prepare for the worst.
SQL Server database recovery is easier to achieve if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls the following:
• The speed and size of your transaction log backups
• The degree to which you might be at risk of losing committed transactions in the event of media failure
Models
There are three types of database recovery models available:
• Full Recovery
• Bulk-Logged Recovery
• Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. The SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
• Database restoration up to any specified point in time can be achieved after media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
• The Log Marks feature allows you to place reference points in the transaction log so that you can recover to a log mark.
• CREATE INDEX operations are logged; recovery from a transaction log backup that includes index creations is faster because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
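The backup-and-restore idea can be illustrated in miniature with SQLite's backup API. This only sketches the general strategy; the FULL/BULK_LOGGED/SIMPLE models discussed above are specific to SQL Server, and the table here is a made-up example:

```python
import sqlite3

# Sketch: take a full backup, simulate data loss, restore from the backup.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
src.execute("INSERT INTO account VALUES (1, 100.0)")
src.commit()

backup = sqlite3.connect(":memory:")
src.backup(backup)                      # take a full backup

src.execute("DELETE FROM account")      # simulate data loss
src.commit()

backup.backup(src)                      # restore from the backup
print(src.execute("SELECT balance FROM account WHERE id = 1").fetchone())
# (100.0,)
```

What the recovery models above add on top of this basic copy is the transaction log, which allows rolling forward committed transactions made after the last full backup.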
(D) Describe deadlocks in a distributed system.
Ans:
B is fully dependent on A if B is functionally dependent on A but not on any proper subset of A.
Second Normal Form (2NF)
A relation is in 2NF if it is in 1NF and every non-primary-key attribute is fully functionally dependent on the primary key; that is, 1NF plus no partial functional dependencies. A partial functional dependency occurs when one or more non-key attributes are functionally dependent on part of the primary key. Every non-key attribute must be defined by the entire key, not just by part of the key. If a relation has a single attribute as its key, then it is automatically in 2NF.
1NF to 2NF:
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.
Third Normal Form (3NF)
A relation is in 3NF if it is in 1NF and 2NF and no non-primary-key attribute is transitively dependent on the primary key. 3NF is based on the concept of transitive dependency: a functional dependency between two or more non-key attributes. If A, B and C are attributes of a relation such that A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).
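The 2NF-to-3NF step above can be sketched on a toy relation with a transitive dependency: stud_id → dept_id and dept_id → dept_name, so dept_name depends on the key only through dept_id. All names and data here are illustrative assumptions:

```python
# Toy relation Student(stud_id, dept_id, dept_name) with a transitive
# dependency stud_id -> dept_id -> dept_name.
students = [
    (1, 10, "CS"),
    (2, 10, "CS"),      # dept_name "CS" is stored redundantly
    (3, 20, "Math"),
]

# Remove the transitive dependency by projecting it into its own relation.
student = {(sid, did) for sid, did, _ in students}      # stud_id -> dept_id
dept    = {(did, dname) for _, did, dname in students}  # dept_id -> dept_name

print(sorted(student))
print(sorted(dept))  # each dept_name now stored once per department
```

Renaming a department now means updating one row in dept rather than every matching student row, which is the update anomaly 3NF removes.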
OR
(c) Explain multivalued dependency with a suitable example.
Ans:
1. NF2: non-first normal form
2. 1NF: R is in 1NF iff all domain values are atomic
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
5. BCNF: R is in BCNF iff every determinant is a candidate key
6. Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and its multivalued dependencies are functional dependencies. 4NF removes unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form, either there is no multivalued dependency in the relation, or there are multivalued dependencies but the attributes are dependent between themselves. One of these conditions must hold, and the relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers multivalued dependencies.
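A common illustration of a multivalued dependency, using a hypothetical Course(course, teacher, book) relation in which teachers and books vary independently; the names and data are assumptions for the example:

```python
# course ->> teacher and course ->> book: teachers and books for a course
# are independent, so the relation must hold every combination.
course = {("DBMS", "Rao",  "Korth"),
          ("DBMS", "Rao",  "Navathe"),
          ("DBMS", "Shah", "Korth"),
          ("DBMS", "Shah", "Navathe")}

teachers = {t for c, t, b in course}
books    = {b for c, t, b in course}

# The MVD holds for this instance: the relation equals the cross product.
print(course == {("DBMS", t, b) for t in teachers for b in books})  # True
```

The redundancy is visible: adding one new book forces one row per teacher. The 4NF decomposition splits the relation into (course, teacher) and (course, book), after which each fact is stored once.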
(d) What are inference axioms? Explain their significance in relational database design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule stating that if a relation satisfies certain FDs, then it must satisfy certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: if Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: if X → Y and X → Z, then X → YZ
F4 Projectivity: if X → YZ, then X → Y
F5 Transitivity: if X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: if X → Y and YZ → W, then XZ → W
Examples of the use of the inference axioms:
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City - Given
2. Street Zip → Street City - Augmentation of (1) by Street
3. City Street → Zip - Given
4. City Street → City Street Zip - Augmentation of (3) by City Street
5. Street Zip → City Street Zip - Transitivity of (2) and (4)
[From Maier]
2. Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived from F.
1. AB → E - Given
2. AB → AB - Reflexivity
3. AB → B - Projectivity from (2)
4. AB → BE - Additivity from (1) and (3)
5. BE → I - Given
6. AB → I - Transitivity from (4) and (5)
7. E → G - Given
8. AB → G - Transitivity from (1) and (7)
9. AB → GI - Additivity from (6) and (8)
10. GI → H - Given
11. AB → H - Transitivity from (9) and (10)
12. AB → GH - Additivity from (8) and (11)
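The Maier derivation above can be checked mechanically by computing the attribute closure of AB under F: since G and H both appear in the closure, AB → GH follows. A sketch in Python, with attribute sets written as strings:

```python
# Attribute closure: repeatedly apply FDs whose left side is already
# contained in the result until nothing more can be added.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

F = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]
c = closure("AB", F)
print({"G", "H"} <= c)  # True, so AB -> GH is derivable
```

Note that J also lands in the closure (via AG → J once G is derived), echoing the source's slight inconsistency of listing J in F but not in R.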
Significance in relational database design: The inference axioms let the designer derive the complete set of functional dependencies implied by a given set, which is the basis for finding keys and for normalization. The relational model itself is a database structure, commonly used in GIS, in which data is stored in two-dimensional tables and multiple relationships between data elements can be defined and established in an ad hoc manner. A relational database management system (RDBMS) is a database system made up of files with data elements in two-dimensional arrays (rows and columns); it has the capability to recombine the data elements to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables which:
• are manipulated a set at a time, rather than a record at a time;
• are manipulated using SQL.
The relational model was proposed by Dr. Codd in 1970 and is the basis for the relational database management system (RDBMS). It contains the following components:
• a collection of objects or relations;
• a set of operations to act on the relations.
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that is being locked by the other. It can be dealt with in two ways: one is to take measures which prevent deadlocks from happening, and the other is to provide ways to break a deadlock after it happens. One way to prevent or avoid deadlocks is to require the user to request all necessary locks at one time, ensuring they gain access to everything they need or to nothing. Alternatively, deadlocks can sometimes be avoided by setting a resource access order, meaning resources must be locked in a certain order, which prevents such cycles from forming. Once a deadlock does occur, the DBMS must have a method for detecting it; to resolve it, the DBMS must select a victim transaction to cancel and revert that entire transaction until the resources required become available, allowing one transaction to complete while the other has to be reprocessed at a later time.

Explain the meaning of the expression "ACID transaction":
ACID stands for Atomicity, Consistency, Isolation, Durability. When any transaction happens, it should be atomic: it should either complete fully or not at all; there should be no such thing as a semi-complete transaction. The database state should remain consistent after the completion of the transaction. If there is more than one transaction, the transactions should be scheduled in such a fashion that they remain in isolation from one another. Durability means that once a transaction commits, its effects will persist even if there are system failures.

What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the process of being changed.
changed Itrsquos purpose is to ensure consistency throughout the database for example if I
am changing a row which effects the calculations or outputs of several other rows then
all rows that are effected or possibly effected by a change in the row Irsquom working on will
be locked from changes until I am complete with my change This isolates the change and
ensures that the data interaction remains accurate and consistent and is known as
transaction level consistencyThe transaction being changed which may effect serveral
other pieces of data or rows of input could also effect how those rows are read So lets
say Irsquom processing a change to the tax rate inmy state so my store clerk shouldnrsquot be able
to read the total cost of a blue shirt because the total cost row is effected by any changes in
the tax rate row Essentially how you deal with the reading and viewing of data while a
change is being processed but hasnrsquot been committed is known as the transaction
isolation level Itrsquos purpose is to ensure that no one is misinformed prior to a transaction
be committed
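The resource-access-order avoidance strategy described above can be sketched with two threads that transfer between two accounts in opposite directions. Because both acquire locks in a fixed global order, no wait cycle can form; the account ids and the transfer function are illustrative assumptions:

```python
import threading

# One lock per account; ids define the global lock order (an assumption).
locks = {1: threading.Lock(), 2: threading.Lock()}

def transfer(src, dst, log):
    # Always lock in id order, regardless of transfer direction.
    first, second = sorted((src, dst))
    with locks[first]:
        with locks[second]:
            log.append((src, dst))

log = []
t1 = threading.Thread(target=transfer, args=(1, 2, log))
t2 = threading.Thread(target=transfer, args=(2, 1, log))
t1.start(); t2.start()
t1.join(); t2.join()
print(len(log))  # 2: both transfers complete, no deadlock
```

If each thread instead locked its source account first, t1 could hold lock 1 while t2 holds lock 2, and each would wait on the other forever; that is exactly the cycle the fixed ordering rules out.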
(b) Explain concurrency control and database recovery in detail.
Ans: In a multiprogramming environment, where multiple transactions can be executed simultaneously, it is highly important to control the concurrency of transactions. We have concurrency control protocols to ensure the atomicity, isolation and serializability of concurrent transactions. Concurrency control protocols can be broadly divided into two categories:
• Lock-based protocols
• Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two kinds:
• Binary locks: a lock on a data item can be in two states; it is either locked or unlocked.
• Shared/exclusive locks: this type of locking mechanism differentiates the locks based on their use. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock, since allowing more than one transaction to write to the same data item would lead the database into an inconsistent state. Read locks are shared, because no data value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a write operation is performed. Transactions may unlock the data item after completing the 'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts In the first
part when the transaction starts executing it seeks permission for the locks it requires The
second part is where the transaction acquires all the locks As soon as the transaction releases its
first lock the third phase starts In this phase the transaction cannot demand any new locks it
only releases the acquired locks
Two-phase locking has two phases one is growing where all the locks are being acquired by
the transaction and the second phase is shrinking where the locks held by the transaction are
being released
To claim an exclusive (write) lock a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock
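The growing/shrinking rule can be sketched in a few lines of Python (a toy illustration, not a real lock manager; the names are invented): once the transaction releases its first lock, any further lock request is rejected.

```python
# Toy sketch of the two-phase rule.
class TwoPhaseTransaction:
    def __init__(self):
        self.locks = set()       # items currently locked by this transaction
        self.shrinking = False   # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            # 2PL: no new locks once the shrinking phase has begun
            raise RuntimeError("2PL violation: cannot lock after first unlock")
        self.locks.add(item)     # growing phase

    def unlock(self, item):
        self.shrinking = True    # the first release ends the growing phase
        self.locks.discard(item)
```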
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as that of 2PL. After acquiring all the locks in the first phase, the
transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a
lock after using it: Strict-2PL holds all the locks until the commit point and releases them all
at one time.
Strict-2PL does not have cascading aborts as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either system time or a logical counter as the timestamp.
Lock-based protocols manage the order between the conflicting pairs among transactions at the
time of execution whereas timestamp-based protocols start working as soon as a transaction is
created
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 00:02 would be older than all other
transactions that come after it. For example, any transaction y entering the system at 00:04 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read- and write-timestamp. This lets the system
know when the last 'read' and 'write' operations were performed on the data item.
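A simplified sketch of basic timestamp ordering (illustrative; the names are invented, and real implementations roll the transaction back and restart it rather than just returning False): each item keeps its latest read- and write-timestamps, and an operation arriving with a timestamp older than a conflicting later one is rejected.

```python
import itertools

_clock = itertools.count(1)      # logical counter used as the timestamp source

class Item:
    def __init__(self):
        self.read_ts = 0         # timestamp of the latest read
        self.write_ts = 0        # timestamp of the latest write

def new_transaction_ts():
    """Each transaction gets its timestamp when it is created."""
    return next(_clock)

def read(item, ts):
    if ts < item.write_ts:       # a younger transaction already overwrote it
        return False             # reject: the reading transaction must restart
    item.read_ts = max(item.read_ts, ts)
    return True

def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:
        return False             # too-late write: the transaction must restart
    item.write_ts = ts
    return True
```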
OR
(b) Explain database security mechanisms. (8)
Database security covers and enforces security on all aspects and components of databases. This
includes:
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator
and/or other information security professional
Some of the ways database security is analyzed and implemented include
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d) Explain knowledge based database system in detail
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data: Data was usually represented in a tabular format with strings or numbers in each
field
Multiple users: A conventional database needed to support more than one user or system
logged into the same data at the same time
Transactions: An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users These are the so-
called ACID properties Atomicity Consistency Isolation and Durability
Large, long-lived data: A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data Such a database usually needed to persist past
the specific uses of any individual program it needed to store data for years and decades
rather than for the life of a program
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes subclasses and instances
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database The knowledge-base needed to know facts about the world For example to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge but instead would need to store information about thousands of tables that
represented information about specific humans Representing that all humans are mortal and
being able to reason about any given human that they are mortal is the work of a knowledge-
base Representing that George Mary Sam Jenna Mike and hundreds of thousands of other
customers are all humans with specific ages sex address etc is the work for a database[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified either as knowledge-based in the sense of an expert
system that performed automated reasoning, or as knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 – 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans: Concurrency control is the procedure in DBMS for managing simultaneous
operations without their conflicting with one another. Concurrent access is quite easy if all
users are just reading data; there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts which mostly occur with a multi-
user system It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of respective databases
Therefore, concurrency control is an essential element for the proper functioning of a system where two or more database transactions that require access to the same data
are executed simultaneously
(ii) Atomicity property
In database systems, atomicity (/ˌætəˈmɪsəti/; from Ancient Greek ἄτομος, translit. átomos, lit. 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs[1] A guarantee of atomicity prevents updates to the database
occurring only partially which can cause greater problems than rejecting the whole series
outright As a consequence the transaction cannot be observed to be in progress by another
database client At one moment in time it has not yet happened and at the next it has already
occurred in whole (or nothing happened if the transaction was cancelled in progress)
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and depositing it into account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state; that is, money is neither lost nor created if either of those two operations fails.
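The A-to-B transfer can be sketched with Python's sqlite3 (an illustrative toy; the account names and amounts are invented): both UPDATEs sit in one transaction, so a failure rolls the partial withdrawal back and the balances stay consistent.

```python
import sqlite3

# Invented accounts and balances for illustration.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
db.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
db.commit()

def transfer(amount):
    try:
        db.execute("UPDATE account SET balance = balance - ? WHERE name = 'A'",
                   (amount,))
        (bal,) = db.execute("SELECT balance FROM account WHERE name = 'A'").fetchone()
        if bal < 0:
            raise ValueError("insufficient funds")  # force the whole transfer to fail
        db.execute("UPDATE account SET balance = balance + ? WHERE name = 'B'",
                   (amount,))
        db.commit()              # both updates become durable together
    except Exception:
        db.rollback()            # atomicity: the partial withdrawal is undone

transfer(30)    # succeeds: A = 70, B = 80
transfer(999)   # fails and rolls back: balances unchanged
```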
(B) Give the three-level architecture proposal for DBMS
Ans: Objectives of the three-level architecture proposal for DBMS:
All users should be able to access the same data
A user's view is immune to changes made in other views
Users should not need to know physical database storage details
DBA should be able to change database storage structures without affecting the users' views
Internal structure of database should be unaffected by changes to physical aspects of storage
DBA should be able to change the conceptual structure of the database without affecting all users
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
The above three levels are explained in detail below:
External Level
This is the highest level one that is closest to the user It is also called the user view The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language such as COBOL or C or a fourth-generation
language specific to the DBMS such as visual FoxPro or MS Access
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using. At the conceptual level the data is viewed without any of these constraints
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus, the objectives of the three-level architecture proposal for DBMS are suitably explained
above.
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2 DML Compiler and Query optimizer - The DML commands such as insert update delete
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access The object code is then optimized in the best way to execute a query by
the query optimizer and then sent to the data manager
3 Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System
The main functions of the Data Manager are -
Converts the operations in users' queries, coming from the application programs or from the combination of
the DML Compiler and Query Optimizer (together known as the Query Processor), from the user's logical view
to the physical file system
Controls DBMS information access that is stored on disk
It also controls handling buffers in main memory
It also enforces constraints to maintain consistency and integrity of the data
It also synchronizes the simultaneous operations performed by the concurrent users
It also controls the backup and recovery operations
4 Data Dictionary - Data Dictionary is a repository of description of the data in the database. It
contains information about:
1 Data - names of the tables, names of attributes of each table, length of attributes, and number of rows in each table
2 Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed
3 Constraints on data, i.e. the range of values permitted
4 Detailed information on physical database design, such as storage structure,
access paths, files and record sizes
5 Access Authorization - the description of database users, their responsibilities
and their access rights
6 Usage statistics, such as frequency of queries and transactions
The data dictionary is used to actually control the data integrity, database operation
and accuracy. It may be used as an important part of the DBMS.
Importance of Data Dictionary -
Data Dictionary is necessary in the databases due to the following reasons:
• It improves the control of the DBA over the information system and the users'
understanding of the use of the system
• It helps in documenting the database design process by storing documentation of the result of every design phase and design decisions
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts the high-level queries into low-level file access
commands known as compiled DML
7 End Users - The users of the database system can be classified in the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve User: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect only a precise portion of the database - in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of response is always indicated. Thus a very competent database designer could be allowed to use a particular database system only as a naive user
ii) Online users There are users who may communicate with the database directly via an online terminal or indirectly via a user interface and application program These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program The more sophisticated of these users may also use a data manipulation language to manipulate the database directly On-line users can also be naive users requiring help such as menus
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general purpose programming language such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database
iv) Database Administrator Centralized control of the database is exerted by a person or group of persons under the supervision of a high level administrator This person or group is referred to as the database administrator (DBA) They are users who are the most familiar with the database and are responsible for creating modifying and maintaining its three levels
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS
(D) What are the advantages of using a DBMS over the conventional
file processing system
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications and data sharing: the spatial database
becomes a common resource for an agency. It implies separation of physical storage from use of the
data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored, since such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic operations), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating from one device to
another, e.g. from optical to magnetic storage, or from tape to disk
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updation of the same data in different files
• Time wasted in entering the same data again and again
• Computer resources needlessly used
• Difficulty in combining information
2 Elimination of Inconsistency - In the file processing system information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data, so we need to remove this duplication of
data across multiple files to eliminate inconsistency
3 Better service to the users - A DBMS is often used to provide better services to the users. In a
conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to anticipated information requests
Centralizing the data in the database also means that user can obtain new and combined
information easily that would have been impossible to obtain otherwise Also use of DBMS
should allow users that don't know programming to interact with the data more easily, unlike a
file processing system where the programmer may need to write new programs to meet every
new demand
4 Flexibility of the System is improved - Since changes are often necessary to the contents of
the data stored in any system these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the
data in the database changes
5 Integrity can be improved - Since the data of the organization using the database approach is
centralized and would be used by a number of users at a time, it is essential to enforce
integrity-constraints
In the conventional systems because the data is duplicated in multiple files so updating or
changes may sometimes lead to entry of incorrect data in some files where it exists
6 Standards can be enforced - Since all access to the database must be through DBMS so
standards are easier to enforce Standards may relate to the naming of data format of data
structure of the data etc Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems
7 Security can be improved - In conventional systems applications are developed in an
adhoctemporary manner Often different system of an organization would access different
components of the operational data; in such an environment, enforcing security can be quite
difficult Setting up of a database makes it easier to enforce security restrictions since data is
now centralized It is easier to control who has access to what parts of the database Different
checks can be established for each type of access (retrieve modify delete etc) to each piece
of information in the database
8 Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers the work of its unit as the most
important and therefore considers its needs as the most important. Once a database has been
set up with centralized control, it will be necessary to identify the organization's requirements and
to balance the needs of the competing units. So it may become necessary to ignore some
requests for information if they conflict with a higher priority need of the organization
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system Although the initial cost of setting up of a database can be large
one normally expects the overall cost of setting up a database and developing and maintaining
application programs to be far lower than for similar service using conventional systems
Since the productivity of programmers can be higher in using non-procedural languages that
have been developed with DBMS than using procedural languages
10 Data Model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for an organization be built. In
conventional systems it is more likely that files will be designed as per the needs of particular
applications. The overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term
11 Provides backup and Recovery - Centralizing a database provides schemes such as
recovery and backups from failures, including disk crashes, power failures and software errors,
which may help the database to recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process; all business managers (or their designates) should be
involved, and the model should be validated with a "bottom-up" approach. It has three primary components: entity,
relationship and attributes.
Many notation methods Chen was the first to become established
The building blocks of E-R model are entities relationships and attributes
Entity An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified An entity is an abstraction from the complexities of some
domain When we speak of an entity we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object, such as a house or a car; an event, such as a house sale or a car service; or a concept, such as a customer transaction or order
An entity-type is a category An entity strictly speaking is an instance of a given entity-type There are
usually many instances of an entity-type Because the term entity-type is somewhat cumbersome most
people tend to use the term entity as a synonym for this term
Attributes: An attribute is a characteristic of an entity. A Student's (entity) attributes: student ID, student name,
address etc
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer; a supervises relationship between an employee and a department; a performs relationship
between an artist and a song; a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. Types of
relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number)
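As an illustrative sketch only (the Python names below are invented; an ER model is a design notation, not code), the Customer entity's attribute types can be mirrored with dataclasses: a composite name, a multivalued phone_numbers, and a derived age computed from date_of_birth rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Name:                      # composite attribute: made of simple parts
    first_name: str
    last_name: str
    middle_name: str = ""

@dataclass
class Customer:
    customer_id: int             # primary-key attribute
    name: Name                   # composite attribute
    date_of_birth: date
    phone_numbers: list = field(default_factory=list)   # multivalued attribute

    @property
    def age(self) -> int:        # derived attribute: computed, never stored
        today = date.today()
        before_birthday = (today.month, today.day) < (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - before_birthday
```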
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example
Ans: In sequential files, index-sequential files and direct files we have considered the retrieval and
update of data based on the primary key.
(i) We can retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
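A minimal sketch of the idea (illustrative Python; the student records are invented): a secondary index on stud_name maps each name to the set of primary keys carrying it, so one key value can return several records.

```python
from collections import defaultdict

# Invented student file: stud_id is the primary key.
students = {
    1: {"stud_name": "Asha", "dept": "MCA"},
    2: {"stud_name": "Ravi", "dept": "MCA"},
    3: {"stud_name": "Asha", "dept": "MBA"},
}

# Secondary index: stud_name -> set of primary keys with that name.
name_index = defaultdict(set)
for sid, rec in students.items():
    name_index[rec["stud_name"]].add(sid)

def find_by_name(name):
    """Secondary key retrieval: may return several matching records."""
    return [students[sid] for sid in sorted(name_index.get(name, ()))]
```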
(D) Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3- EITHER
(A) Let R(A,B,C) and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries -
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency? Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows
1 If a table can be decomposed into three or more smaller tables it must be capable of being joined
again on common keys to form the original table
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot
be non-loss decomposed into any number of smaller tables.
Another way of expressing this is that every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes
Anomalies can occur in relations in 4NF if the primary key has three or more fields
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further then it is in 5NF
Pairwise cyclical dependency means that:
You always need to know two values (pairwise)
For any one you must know the other two (cyclical)
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell Jeans? How many records must you create to
record this fact?
The problem is there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item
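The decomposition can be checked with a small Python sketch using the sample data above: project Buying onto the three pairwise tables, then natural-join them back. For this data the join reproduces exactly the original rows, which is what the join dependency guarantees.

```python
# Sample data from the table above, as (buyer, vendor, item) tuples.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three pairwise projections (the proposed 5NF tables).
buyer_vendor = {(b, v) for b, v, _ in buying}
buyer_item   = {(b, i) for b, _, i in buying}
vendor_item  = {(v, i) for _, v, i in buying}

# Natural join of the three projections on their common columns.
rejoined = {(b, v, i)
            for (b, v) in buyer_vendor
            for (b2, i) in buyer_item if b2 == b
            for (v2, i2) in vendor_item if v2 == v and i2 == i}
```

When Claiborne starts selling jeans, only the single row ('Liz Claiborne', 'Jeans') need be added to vendor_item; the join then yields the new fact for every buyer who buys jeans from Claiborne.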
(B) Explain the architecture of an IMS System
Ans: Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs
[Fig: IMS system structure - application programs A and B (host language + DL/I) access the data
through PCBs grouped into PSBs (PSB-A, PSB-B); the IMS control program uses the DBDs to map
the requests onto the physical databases]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD). The mapping of the physical database to storage is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:

1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called a program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example:

1 PCB    TYPE=DB,DBDNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are supported via user-written on-line application programs; IMS does not provide an integrated query language.
OR
(C) Explain the following:
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
Main characteristics of functional dependencies used in normalization:
- They have a 1:1 relationship between the attribute(s) on the left-hand and right-hand sides of the dependency.
- They hold for all time.
- They are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
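To make the definition concrete, here is a small illustrative Python helper; the function name and the staff rows are hypothetical examples, not from the question paper. It tests whether a functional dependency X → Y holds in a given set of rows:

```python
# Hypothetical helper: check whether the FD lhs -> rhs holds in a list of row dicts.
def holds_fd(rows, lhs, rhs):
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # same determinant value, different dependent value
    return True

# Assumed sample data: staff_no determines branch, but branch does not determine salary.
staff = [
    {"staff_no": "S1", "branch": "B1", "salary": 30000},
    {"staff_no": "S2", "branch": "B1", "salary": 35000},
]
print(holds_fd(staff, ["staff_no"], ["branch"]))  # True
print(holds_fd(staff, ["branch"], ["salary"]))    # False
```

A check like this can only refute a dependency from sample data; that an FD "holds for all time" is a statement about the semantics of the relation, not about any one extension of it.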
(D) Explain 4NF with examples.
Ans: Normalization: the process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal-form condition that the relation meets, and indicates the degree to which it has been normalized. Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF, or 4NF; we will pay particular attention up to 3NF. The database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form which has known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form.
2. 1NF: R is in 1NF iff all domain values are atomic.
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
5. BCNF: R is in BCNF iff every determinant is a candidate key.
6. Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and every non-trivial multi-valued dependency is in fact a functional dependency. 4NF thus removes an unwanted data structure: multi-valued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
- There is no multivalued dependency in the relation, or
- There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it uses multivalued dependencies.
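The standard course/teacher/book illustration of a multivalued dependency can be sketched in Python (the sample values are assumed for illustration). With course ->> teacher and course ->> book, every teacher must be paired with every book, so the 4NF decomposition into two tables removes the redundancy while remaining lossless:

```python
# Classic MVD example: CTX(course, teacher, book), with course ->> teacher and course ->> book.
ctx = {
    ("Physics", "Green", "Mechanics"),
    ("Physics", "Green", "Optics"),
    ("Physics", "Brown", "Mechanics"),
    ("Physics", "Brown", "Optics"),
}

# 4NF decomposition: one table per multivalued fact.
course_teacher = {(c, t) for c, t, b in ctx}
course_book    = {(c, b) for c, t, b in ctx}

# The join of the two projections recovers the original relation (lossless).
rejoined = {(c, t, b)
            for c, t in course_teacher
            for c2, b in course_book if c2 == c}
print(rejoined == ctx)  # True
```

In the decomposed design, adding a new book for Physics is one insert into course_book instead of one insert per teacher, which is exactly the update anomaly 4NF eliminates.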
Q5
EITHER
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
- Most object databases offer some kind of query language, allowing objects to be found by a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
- Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
- Another area of variation between products is the way the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
- Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
- Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
- The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could retrieve a user's account information and efficiently provide extensive information such as transactions and account entries.
C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery is easier to achieve if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls the following:
- The speed and size of your transaction log backups.
- The degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery models available:
- Full Recovery
- Bulk-Logged Recovery
- Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
- Database restoration up to any specified time can be achieved after a media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
- The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
- CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index creations is faster because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans
1NF to 2NF
- Identify the primary key for the 1NF relation.
- Identify the functional dependencies in the relation.
- If partial dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.
Third Normal Form (3NF)
A relation is in 3NF if it is in 2NF and has no transitive dependencies.
Transitive dependency: a functional dependency between two or more non-key attributes. 3NF is based on the concept of transitive dependency: if A, B, and C are attributes of a relation such that A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).
3NF: a relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.
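Removing a transitive dependency can be illustrated with a small Python sketch; the staff/branch attribute names and values are assumed examples. With staff_no → branch and branch → branch_addr, the address is transitively dependent on staff_no, so it is moved out with its determinant:

```python
# Hypothetical 1NF relation with a transitive dependency staff_no -> branch -> branch_addr.
staff = [
    {"staff_no": "S1", "branch": "B1", "branch_addr": "Main St"},
    {"staff_no": "S2", "branch": "B1", "branch_addr": "Main St"},
    {"staff_no": "S3", "branch": "B2", "branch_addr": "High St"},
]

# 3NF decomposition: keep staff_no -> branch, and move branch -> branch_addr
# into its own relation keyed by the determinant (branch).
staff_branch = [{"staff_no": r["staff_no"], "branch": r["branch"]} for r in staff]
branch = {r["branch"]: r["branch_addr"] for r in staff}

print(branch)  # the address is now stored once per branch, not once per staff member
```

After the split, changing a branch address is a single update, removing the update anomaly that the transitive dependency caused.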
OR
(c) Explain multivalued dependency with a suitable example.
Ans:
1. NF2: non-first normal form.
2. 1NF: R is in 1NF iff all domain values are atomic.
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
5. BCNF: R is in BCNF iff every determinant is a candidate key.
6. Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and every non-trivial multi-valued dependency is in fact a functional dependency. 4NF thus removes an unwanted data structure: multi-valued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
- There is no multivalued dependency in the relation, or
- There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it uses multivalued dependencies.
(d) What are inference axioms? Explain their significance in relational database design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule that states that if a relation satisfies certain FDs, then it must satisfy certain other FDs:
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of inference axioms:
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City — given
2. Street Zip → Street City — augmentation of (1) by Street
3. City Street → Zip — given
4. City Street → City Street Zip — augmentation of (3) by City Street
5. Street Zip → City Street Zip — transitivity of (2) and (4)
[From Maier]
1. Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F:
1. AB → E — given
2. AB → AB — reflexivity
3. AB → B — projectivity from (2)
4. AB → BE — additivity from (1) and (3)
5. BE → I — given
6. AB → I — transitivity from (4) and (5)
7. E → G — given
8. AB → G — transitivity from (1) and (7)
9. AB → GI — additivity from (6) and (8)
10. GI → H — given
11. AB → H — transitivity from (9) and (10)
12. AB → GH — additivity from (8) and (11)
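The same result can be checked mechanically with the standard attribute-closure algorithm. The following Python sketch computes the closure of AB under F and confirms that it contains G and H, so AB → GH follows:

```python
# Closure of an attribute set under a set of FDs (each FD is a (lhs, rhs) pair of strings).
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is contained in the closure, add the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

F = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]
print(sorted(closure("AB", F)))  # ['A', 'B', 'E', 'G', 'H', 'I', 'J'] -- contains G and H
```

Since G and H are both in the closure of AB, additivity gives AB → GH, matching the step-by-step derivation above.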
Significance in relational database design: A relational database is a database structure, commonly used in GIS, in which data is stored in two-dimensional tables, and in which multiple relationships between data elements can be defined and established in an ad-hoc manner. A relational database management system (RDBMS) is a database system made up of files with data elements in two-dimensional arrays (rows and columns); it has the capability to recombine data elements to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables:
- Tables are manipulated a set at a time, rather than a record at a time.
- SQL is used to manipulate relational databases. The relational model was proposed by Dr. Codd in 1970.
- It is the basis for the relational database management system (RDBMS).
- The relational model contains the following components:
  - A collection of objects or relations.
  - A set of operations to act on the relations.
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions each require access to data that is being locked by the other. It can be handled in two ways: by setting measures that prevent deadlocks from happening, or by setting ways to break a deadlock after it happens. One way to prevent or avoid deadlocks is to require the user to request all necessary locks at one time, ensuring they gain access to everything they need or nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order, meaning resources must be locked in a certain order. Once a deadlock does occur, the DBMS must have a method for detecting it; to resolve it, the DBMS must select a victim transaction and roll that entire transaction back until the resources it held become available, allowing one transaction to complete while the other has to be reprocessed at a later time.

Explain the meaning of the expression "ACID transaction".
Ans: ACID means Atomicity, Consistency, Isolation, Durability. Any transaction should be atomic: it should either complete fully or have no effect at all; there is no such thing as a semi-complete transaction. The database state should remain consistent after the completion of the transaction. If there is more than one transaction, they should be scheduled in such a fashion that they remain in isolation from one another. Durability means that once a transaction commits, its effects will persist even if there are system failures.

What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the process of being changed. Their purpose is to ensure consistency throughout the database. For example, if I am changing a row which affects the calculations or outputs of several other rows, then all rows that are affected (or possibly affected) by my change will be locked from changes until my change is complete. This isolates the change and ensures that the data interaction remains accurate and consistent, and is known as transaction-level consistency. The transaction being changed may also affect how other rows are read: say I am processing a change to the tax rate in my state; my store clerk should not be able to read the total cost of a blue shirt, because the total cost row is affected by any change in the tax rate row. Essentially, how the reading and viewing of data is handled while a change is being processed but has not yet been committed is known as the transaction isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction being committed.
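The deadlock-detection step described in this answer is usually implemented with a wait-for graph: a cycle in the graph means a deadlock exists and a victim must be chosen. A minimal Python sketch (transaction names are assumed):

```python
# Deadlock detection via a wait-for graph cycle check (depth-first search).
def has_deadlock(wait_for):
    """wait_for maps each transaction to the set of transactions it is waiting on.
    A cycle in this graph means a deadlock."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {t: WHITE for t in wait_for}

    def visit(t):
        color[t] = GRAY
        for u in wait_for.get(t, ()):
            if color.get(u, WHITE) == GRAY:
                return True            # back edge: cycle found
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in list(wait_for))

# T1 waits on T2 and T2 waits on T1: the classic two-transaction deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))  # True
print(has_deadlock({"T1": {"T2"}, "T2": set()}))   # False
```

In a real DBMS the graph is maintained by the lock manager and checked periodically; when a cycle is found, one transaction on the cycle is aborted as the victim.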
(b) Explain concurrency control and database recovery in detail.
Ans: In a multiprogramming environment, where multiple transactions can be executed simultaneously, it is highly important to control the concurrency of transactions. We have concurrency control protocols to ensure atomicity, isolation, and serializability of concurrent transactions. Concurrency control protocols can be broadly divided into two categories:
- Lock-based protocols
- Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two kinds:
- Binary locks: a lock on a data item can be in two states; it is either locked or unlocked.
- Shared/exclusive locks: this type of locking mechanism differentiates the locks based on their uses. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock; allowing more than one transaction to write on the same data item would lead the database into an inconsistent state. Read locks are shared, because no data value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a write operation is performed. Transactions may unlock the data item after completing the 'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they need locks. Before initiating an execution, the transaction requests the system for all the locks it needs beforehand. If all the locks are granted, the transaction executes and releases all the locks when all its operations are over. If all the locks are not granted, the transaction rolls back and waits until all the locks are granted.
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first part, when the transaction starts executing, it seeks permission for the locks it requires. The second part is where the transaction acquires all the locks. As soon as the transaction releases its first lock, the third phase starts; in this phase, the transaction cannot demand any new locks, it only releases the acquired locks.
Two-phase locking has two phases: a growing phase, where all the locks are being acquired by the transaction, and a shrinking phase, where the locks held by the transaction are being released. To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as that of 2PL. After acquiring all the locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock after using it: it holds all the locks until the commit point and releases them all at once. Strict-2PL does not have cascading aborts, as 2PL does.
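A minimal sketch of a shared/exclusive lock table in the spirit of Strict-2PL (the class and method names are assumptions, and blocking/queueing of waiters is omitted; a refused request would wait or abort in a real system):

```python
# Minimal shared/exclusive lock table; release_all models Strict-2PL's release-at-commit.
class LockTable:
    def __init__(self):
        self.locks = {}  # item -> (mode "S" or "X", set of holder transactions)

    def acquire(self, txn, item, mode):
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)           # shared locks are compatible with each other
            return True
        if holders == {txn}:           # lone holder may upgrade S -> X
            self.locks[item] = ("X" if mode == "X" else held_mode, holders)
            return True
        return False                   # conflict: caller must wait or abort

    def release_all(self, txn):
        # Strict-2PL: all locks are released together at commit/abort time.
        for item in list(self.locks):
            mode, holders = self.locks[item]
            holders.discard(txn)
            if not holders:
                del self.locks[item]

lt = LockTable()
print(lt.acquire("T1", "A", "S"))  # True
print(lt.acquire("T2", "A", "S"))  # True: shared locks are compatible
print(lt.acquire("T2", "A", "X"))  # False: T1 also holds a shared lock on A
```

Only after `lt.release_all("T1")` (T1's commit) could T2's upgrade to an exclusive lock succeed, which is exactly the behavior Strict-2PL prescribes.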
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between conflicting pairs of transactions at the time of execution, whereas timestamp-based protocols start working as soon as a transaction is created. Every transaction has a timestamp associated with it, and the ordering is determined by the age of the transaction. A transaction created at clock time 0002 would be older than all other transactions that come after it; for example, any transaction 'y' entering the system at 0004 is two seconds younger, and priority is given to the older one.
In addition, every data item is given the latest read-timestamp and write-timestamp. This lets the system know when the last 'read' and 'write' operations were performed on the data item.
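The read/write checks of basic timestamp ordering can be sketched as follows (the `Item` structure and function names are assumptions, and the Thomas write rule is omitted). A transaction whose timestamp is too old to respect a younger transaction's access is aborted:

```python
# Basic timestamp-ordering checks on a single data item.
class Item:
    def __init__(self):
        self.read_ts = 0   # latest read timestamp
        self.write_ts = 0  # latest write timestamp

def read(item, ts):
    if ts < item.write_ts:
        return "abort"             # a younger transaction already wrote the item
    item.read_ts = max(item.read_ts, ts)
    return "ok"

def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:
        return "abort"             # would invalidate a younger transaction's read/write
    item.write_ts = ts
    return "ok"

x = Item()
print(write(x, 5))   # ok
print(read(x, 3))    # abort: transaction 3 is older than the writer (ts 5)
print(read(x, 8))    # ok
print(write(x, 6))   # abort: transaction 8 already read x
```

An aborted transaction is restarted with a fresh (younger) timestamp, so the schedule always stays equivalent to the serial order of the timestamps.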
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases. This includes:
- Data stored in the database
- The database server
- The database management system (DBMS)
- Other database workflow applications
Database security is generally planned, implemented, and maintained by a database administrator and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
- Restricting unauthorized access and use by implementing strong and multifactor access and data management controls.
- Load/stress testing and capacity testing of the database to ensure it does not crash under a distributed denial of service (DDoS) attack or user overload.
- Physical security of the database server and backup equipment against theft and natural disasters.
- Reviewing the existing system for any known or unknown vulnerabilities, and defining and implementing a road map/plan to mitigate them.
(d) Explain knowledge-based database systems in detail.
Ans:
The term knowledge-base was coined to distinguish this form of knowledge store from the more common and widely used term database. At the time (the 1970s), virtually all large Management Information Systems stored their data in some type of hierarchical or relational database. At this point in the history of Information Technology, the distinction between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
- Flat data: data was usually represented in a tabular format, with strings or numbers in each field.
- Multiple users: a conventional database needed to support more than one user or system logged into the same data at the same time.
- Transactions: an essential requirement for a database was to maintain integrity and consistency among data accessed by concurrent users. These are the so-called ACID properties: Atomicity, Consistency, Isolation, and Durability.
- Large, long-lived data: a corporate database needed to support not just thousands but hundreds of thousands or more rows of data. Such a database usually needed to persist past the specific uses of any individual program; it needed to store data for years and decades rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database requirements. An expert system requires structured data: not just tables with numbers and strings, but pointers to other objects that in turn have additional pointers. The ideal representation for a knowledge base is an object model (often called an ontology in the artificial intelligence literature) with classes, subclasses, and instances.
Early expert systems also had little need for multiple users or the complexity that comes with requiring transactional properties on data. The data for the early expert systems was used to arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response to an emergency.[1] Once the solution to the problem was known, there was not a critical demand to store large amounts of data back to a permanent memory store. A more precise statement would be that, given the technologies available, researchers compromised and did without these capabilities because they realized they were beyond what could be expected, and they could develop useful solutions to non-trivial problems without them. Even from the beginning, the more astute researchers realized the potential benefits of being able to store, analyze, and reuse knowledge. For example, see the discussion of Corporate Memory in the earliest work of the Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge-base compared to a conventional database. The knowledge-base needed to know facts about the world, for example to represent the statement that "All humans are mortal". A database typically could not represent this general knowledge, but would instead need to store information about thousands of specific humans in its tables. Representing that all humans are mortal, and being able to reason about any given human that they are mortal, is the work of a knowledge-base; representing that George, Mary, Sam, Jenna, Mike, and hundreds of thousands of other customers are all humans with specific ages, sex, address, etc. is the work of a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments, the requirements for their data storage rapidly started to overlap with the standard database requirements for multiple, distributed users with support for transactions. Initially, the demand could be seen in two different but competitive markets. From the AI and object-oriented communities, object-oriented databases such as Versant emerged. These were systems designed from the ground up to have support for object-oriented capabilities, but also to support standard database services as well. On the other hand, the large database vendors, such as Oracle, added capabilities to their products that provided support for knowledge-base requirements, such as class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet. With the rise of the Internet, documents, hypertext, and multimedia support were now critical for any corporate database. It was no longer enough to support large tables of data or relatively small objects that lived primarily in computer memory. Support for corporate web sites required persistence and transactions for documents. This created a whole new discipline known as Web Content Management. The other driver for document support was the rise of knowledge management vendors, such as Lotus Notes. Knowledge management actually predated the Internet, but with the Internet there was great synergy between the two areas. Knowledge management products adopted the term knowledge-base to describe their repositories, but the meaning had a subtle difference. In the case of previous knowledge-based systems, the knowledge was primarily for the use of an automated system to reason about and draw conclusions about the world. With knowledge management products, the knowledge was primarily meant for humans, for example to serve as a repository of manuals, procedures, policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a system that could really be cleanly classified as knowledge-based in the sense of an expert system that performed automated reasoning, versus knowledge-based in the sense of knowledge management that provided knowledge in the form of documents and media that could be leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session: 2018 – 2019
Subject: DBMS
MCA 1st year (Sem. II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another. Concurrent access is quite easy if all users are just reading data, since there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user system. It helps you make sure that database transactions are performed concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, "undividable") is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client: at one moment in time it has not yet happened, and at the next it has already occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B. Performing these operations in an atomic transaction ensures that the database remains in a consistent state, that is, money is neither lost nor created if either of those two operations fails.
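The bank-transfer example can be sketched with Python's built-in sqlite3 module, whose connection context manager commits on success and rolls back on error; the account names and amounts are illustrative:

```python
import sqlite3

# In-memory database with two accounts, as in the A-to-B transfer example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    try:
        with conn:  # opens a transaction; commits on success, rolls back on exception
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            cur = conn.execute("SELECT balance FROM account WHERE name = ?", (src,))
            if cur.fetchone()[0] < 0:
                raise ValueError("insufficient funds")  # forces rollback of the withdrawal
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except ValueError:
        return False

print(transfer(conn, "A", "B", 30))   # True: both updates applied together
print(transfer(conn, "A", "B", 999))  # False: the withdrawal is rolled back
print(dict(conn.execute("SELECT * FROM account")))  # {'A': 70, 'B': 80}
```

The failed second transfer leaves both balances exactly as they were: either both operations happen or neither does, which is the atomicity guarantee described above.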
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
- All users should be able to access the same data.
- A user's view is immune to changes made in other views.
- Users should not need to know physical database storage details.
- The DBA should be able to change database storage structures without affecting the users' views.
- The internal structure of the database should be unaffected by changes to physical aspects of storage.
- The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
The three levels are explained in detail below:
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database; this view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part that
is relevant to that user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
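A small illustration of the first two sublanguages, using Python's built-in sqlite3 module (SQLite executes DDL and DML; DCL statements such as GRANT/REVOKE require a multi-user DBMS, so one is shown only as a comment):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and declare a database object.
conn.execute("CREATE TABLE student (stud_id INTEGER PRIMARY KEY, name TEXT)")

# DML: operate on that object.
conn.execute("INSERT INTO student (stud_id, name) VALUES (1, 'Asha')")
rows = conn.execute("SELECT name FROM student").fetchall()
print(rows)  # [('Asha',)]

# DCL (illustrative only -- SQLite has no user accounts, so this statement
# would run on a server DBMS such as Oracle or MySQL, not here):
# GRANT SELECT ON student TO some_user;
```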
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus the objectives of the three-level architecture proposal for DBMS are explained
above.
(C) Describe the structure of DBMS
Ans: A DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update and
retrieval) on the database. The components of the DBMS perform these requested operations on the
database and provide the necessary data to the users.
Fig: Structure of a Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified
in the DDL. It records metadata information such as the names of the files, data items, storage
details of each file, mapping information, constraints, etc.
2. DML Compiler and Query Optimizer - DML commands such as insert, update, delete and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized by the query optimizer to find the best way to execute the
query, and then sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
- Converting operations in users' queries, coming from the application programs or from the
DML compiler and query optimizer (together known as the Query Processor), from the user's logical view
to the physical file system.
- Controlling access to the DBMS information stored on disk.
- Handling buffers in main memory.
- Enforcing constraints to maintain the consistency and integrity of the data.
- Synchronizing the simultaneous operations performed by concurrent users.
- Controlling backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data - the names of the tables, the names of the attributes of each table, the lengths of attributes, and the number of rows in each table.
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
5. Access authorization - a description of database users, their responsibilities
and their access rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to control data integrity, database operation and accuracy,
and may be regarded as an important part of the DBMS.
Importance of Data Dictionary -
The data dictionary is necessary in databases for the following reasons:
- It improves the DBA's control over the information system and the users'
understanding of the use of the system.
- It helps in documenting the database design process by storing documentation of the result of every design phase and of the design decisions.
5 Data Files - It contains the data portion of the database
6. Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve Users: Naive users need not be aware of the presence of the database system or any other system supporting their interaction. A user of an automatic teller machine falls into this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect only a precise portion of the database - in the case of the ATM user, only one or more of his or her own accounts. Other naive users are those for whom the type and range of responses is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online Users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view, or conceptual level, of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of physical storage from the use of the
data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored; such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one
device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
- Duplication of the same data in different files.
- Wastage of storage space, since duplicated data is stored.
- Errors generated due to updating of the same data in different files.
- Time wasted in entering the same data again and again.
- Needless use of computer resources.
- Difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data. We therefore need to remove this duplication of
data across multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In a
conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, the use of a DBMS
should allow users who don't know programming to interact with the data more easily, unlike a
file processing system where the programmer may need to write new programs to meet every
new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the
data in the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the purposes
of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization access different
components of the operational data, and in such an environment enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database, and different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8. Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work, and therefore its own needs,
the most important. Once a database has been
set up with centralized control, it becomes necessary to identify the organization's requirements and
to balance the needs of the competing units. It may therefore become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed with DBMSs than using procedural languages.
10. Data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand, and the overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes such as
recovery and backup from failures, including disk crashes, power failures and software errors,
which may help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods involved are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The E-R model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. Building the
model is an iterative, team-oriented process in which all business managers (or their designates)
are involved, and the result should be validated with a "bottom-up" approach. The model has three primary components: entities,
relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name,
address, etc.
Attributes are of various types
Simple/Single Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an "owns" relationship between a company and a computer, a "supervises" relationship between an employee and a department, a "performs" relationship
between an artist and a song, a "proved" relationship between a mathematician and a theorem. Relationships
are represented as diamonds, connected by lines to each of the entities in the relationship. The types of
relationship are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: entity Customer, with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), and
street (street_name, street_number, apartment_number).
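One common way to realise this design in a relational DBMS is to flatten the composite attributes into simple columns, keyed on customer_id (a hedged sketch using Python's sqlite3; the exact column mapping is one possible choice, not prescribed by the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite attributes (name, address, street) are flattened into
# simple columns; customer_id remains the primary key.
conn.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,
        first_name       TEXT,
        middle_name      TEXT,
        last_name        TEXT,
        phone_number     TEXT,
        date_of_birth    TEXT,
        city             TEXT,
        state            TEXT,
        zip_code         TEXT,
        street_name      TEXT,
        street_number    TEXT,
        apartment_number TEXT
    )
""")
# A multivalued attribute (e.g. several phone numbers) would instead go
# into its own table referencing customer_id; a derived attribute such
# as age would be computed from date_of_birth rather than stored.
conn.execute("INSERT INTO customer (customer_id, first_name, last_name, city) "
             "VALUES (1, 'Ravi', 'Kumar', 'Nagpur')")
print(conn.execute("SELECT first_name, city FROM customer").fetchall())
```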
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index sequential files and direct files, we have considered the retrieval and
update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file on the attribute "stud_name", we may get a set of
records that satisfy the given value.
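The idea can be sketched as a secondary index built over the non-key attribute stud_name: unlike a primary index, one key value may map to several records (the sample records are invented):

```python
from collections import defaultdict

# A student file keyed on the primary key stud_id.
students = [
    {"stud_id": 1, "stud_name": "Amit",  "city": "Nagpur"},
    {"stud_id": 2, "stud_name": "Priya", "city": "Pune"},
    {"stud_id": 3, "stud_name": "Amit",  "city": "Mumbai"},
]

# Build a secondary index on the non-key attribute stud_name.
# Each key value maps to the list of matching primary keys.
by_name = defaultdict(list)
for rec in students:
    by_name[rec["stud_name"]].append(rec["stud_id"])

print(by_name["Amit"])  # both records with stud_name = 'Amit'
```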
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE3- EITHER
(A) Let R = (A, B, C) and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is a join dependency? Discuss 5NF.
Ans Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot
be further decomposed losslessly into any number of smaller tables.
Another way of expressing this is that every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependency - if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
- you always need to know two values (pairwise);
- for any one value, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item).
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer   vendor          item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key: in order to determine
the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
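The decomposition can be checked mechanically: project the sample data onto the three two-column tables and rejoin them. For the sample data above the join dependency holds, so the natural join returns exactly the original rows (a Python sketch):

```python
# Sample Buying table from above, as (buyer, vendor, item) tuples.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach",       "Jeans"),
    ("Mary",  "Jordach",       "Jeans"),
    ("Sally", "Jordach",       "Sneakers"),
}

# The three projections: Buyer-Vendor, Buyer-Item, Vendor-Item.
bv = {(b, v) for b, v, i in buying}
bi = {(b, i) for b, v, i in buying}
vi = {(v, i) for b, v, i in buying}

# Natural join of all three projections.
rejoined = {(b, v, i)
            for b, v in bv
            for b2, i in bi if b2 == b
            if (v, i) in vi}

print(rejoined == buying)  # lossless for this data: True
```

If Claiborne starts selling jeans, only one row each needs adding to the Buyer-Vendor and Vendor-Item projections, instead of one row per buyer in the original table.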
(B) Explain the architecture of an IMS System
Ans: Information Management System (IMS) is an IBM program product designed to support
both batch and online application programs.
[Fig: Structure of an IMS system. Application A and Application B are each written in a host
language plus DL/I calls; each has its own PSB (PSB-A, PSB-B) made up of PCBs. The IMS
control program sits between the applications and the physical databases, which are defined by DBDs.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also given in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated
conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
DBD   NAME=EDUCPDBD
SEGM  NAME=COURSE,BYTES=256
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
FIELD NAME=DESCRIPN,BYTES=220,START=37
SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
FIELD NAME=LOCATION,BYTES=12,START=7
FIELD NAME=FORMAT,BYTES=2,START=19
SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called the program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
PCB    TYPE=DB,DBDNAME=EDUCPDBD,KEYLEN=15
SENSEG NAME=COURSE,PROCOPT=G
SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: the value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: a possible key. Each non-key field is functionally dependent on every candidate key,
and no attribute in the key can be deleted without destroying the property of
unique identification.
The main characteristics of functional dependencies used in
normalization are that they:
- have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of
the dependency;
- hold for all time;
- are non-trivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation,
and that has the property that every functional dependency in Y is implied by
the functional dependencies in X.
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that it meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF;
we will pay particular attention up to 3NF.
Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies
between its attributes.
It is often executed as a series of steps; each step corresponds to a specific normal form which has
known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF
removes unwanted data structures: multivalued dependencies.
One of these conditions must hold for a relation to be in fourth normal form:
- there is no multivalued dependency in the relation, or
- there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
considers multivalued dependencies.
Q5
Either
(A) What are object oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is the way in which the schema of a database is defined. A
general characteristic, however, is that the programming language and the database schema use the same
type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the
set of all its versions, and object versions can be treated as objects in their own right. Some object
databases also provide systematic support for triggers and constraints, which are the basis of active
databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could retrieve a user's account information and
efficiently provide extensive information such as transactions and account entries.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to
explore the options available in order to prepare for the worst.
SQL Server database recovery is easier if you are running at least SQL Server 2000,
which has a built-in feature known as the database recovery model. It controls the following:
- the speed and size of your transaction log backups;
- the degree to which you are at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery model available:
- Full Recovery
- Bulk-Logged Recovery
- Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
- Database restoration up to any specified point in time can be achieved after media failure for a database
file. If your log file is available after the failure, you can restore up to the last
committed transaction.
- The Log Marks feature allows you to place reference points in the transaction log, so that you can
recover to a log mark.
- CREATE INDEX operations are logged, so recovery from a transaction log backup that includes index
creation is faster, because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the
least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(D) Describe deadlocks in a distributed system.
Ans
OR
(C) Explain multivalued dependency with a suitable example.
Ans: As normalization proceeds, relations become progressively more restricted
(stronger) in format and also less vulnerable to update anomalies.
1. NF2: non-first normal form.
2. 1NF: R is in 1NF iff all domain values are atomic.
3. 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on
the key.
4. 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively
dependent on the key.
5. BCNF: R is in BCNF iff every determinant is a candidate key.
6. Determinant: an attribute on which some other attribute is fully functionally
dependent.
Fourth Normal Form
A multivalued dependency X ↠ Y holds in a relation when, for each value of X, the associated
set of Y values is independent of the values of the remaining attributes. For example, if a course
can be taught by several teachers and use several books, and the teachers and books are
independent of each other, then Course ↠ Teacher and Course ↠ Book hold.
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF
if and only if it is in BCNF and its non-trivial multivalued dependencies are functional
dependencies. 4NF thus removes an unwanted data structure: multivalued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
- there is no multivalued dependency in the relation, or
- there are multivalued dependencies, but the dependent attributes are dependent between
themselves.
The relation must also be in BCNF; fourth normal form differs from BCNF only in that it
considers multivalued dependencies.
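The independence expressed by a multivalued dependency can be checked concretely: if Course ↠ Teacher holds in a Course-Teacher-Book relation, splitting the relation into (Course, Teacher) and (Course, Book) loses nothing, because their natural join reproduces the original rows. The data below is invented for illustration, assuming teachers and books are independent:

```python
# A relation exhibiting the MVDs Course ->> Teacher and Course ->> Book:
# every teacher of a course is paired with every book of that course.
ctb = {
    ("DB", "Smith", "Korth"), ("DB", "Smith", "Navathe"),
    ("DB", "Jones", "Korth"), ("DB", "Jones", "Navathe"),
}

# 4NF decomposition: project onto (Course, Teacher) and (Course, Book).
ct = {(c, t) for c, t, _ in ctb}
cb = {(c, b) for c, _, b in ctb}

# The natural join on Course reconstructs the original relation exactly,
# so the decomposition is lossless.
joined = {(c, t, b) for c, t in ct for c2, b in cb if c == c2}
print(joined == ctb)  # True
```

The two projected tables store four rows in total instead of four redundant combinations, which is precisely the redundancy 4NF eliminates.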
(d) What are inference axioms? Explain their significance in relational database design.
Ans: Inference Axioms (A-axioms or Armstrong's Axioms)
An inference axiom is a rule stating that if a relation satisfies certain FDs, then it must satisfy
certain other FDs.
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of Inference Axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City (given)
2. Street Zip → Street City (augmentation of (1) by Street)
3. City Street → Zip (given)
4. City Street → City Street Zip (augmentation of (3) by City Street)
5. Street Zip → City Street Zip (transitivity of (2) and (4))
[From Maier]
1. Let R = (A B C D E G H I J), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derivable from F.
1. AB → E (given)
2. AB → AB (reflexivity)
3. AB → B (projectivity from (2))
4. AB → BE (additivity from (1) and (3))
5. BE → I (given)
6. AB → I (transitivity from (4) and (5))
7. E → G (given)
8. AB → G (transitivity from (1) and (7))
9. AB → GI (additivity from (6) and (8))
10. GI → H (given)
11. AB → H (transitivity from (9) and (10))
12. AB → GH (additivity from (8) and (11))
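Derivations like the one above can be checked mechanically with the standard attribute-closure algorithm: AB → GH holds under F exactly when GH ⊆ (AB)+. A short sketch, using the FD set from the Maier example:

```python
def closure(attrs, fds):
    """Compute the closure of a set of attributes under a list of FDs (lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is contained in the closure, absorb the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# F from the example: AB->E, AG->J, BE->I, E->G, GI->H
F = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]

print(sorted(closure("AB", F)))       # ['A', 'B', 'E', 'G', 'H', 'I', 'J']
print(set("GH") <= closure("AB", F))  # True: AB -> GH is derivable
```

Note the closure also picks up J (via AG → J, once G is derived), showing how the algorithm finds every attribute determined by AB, not just the ones in the manual proof.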
Significance in relational database design: The inference axioms allow a designer to derive,
from a given set F of functional dependencies, all the dependencies that F logically implies
(the closure F+). This is essential for finding candidate keys, checking the equivalence of
dependency sets, and verifying that decompositions are lossless and dependency-preserving
during normalization.
Recall the setting in which these axioms are applied. A relational database stores data in
two-dimensional tables, in which multiple relationships between data elements can be defined
and established in an ad-hoc manner. A Relational Database Management System (RDBMS) is a
database system made up of files with data elements in a two-dimensional array (rows and
columns); it has the capability to recombine data elements to form different relations, resulting
in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables that:
bull are manipulated a set at a time rather than a record at a time;
bull are manipulated using SQL.
The relational model was proposed by Dr. Codd in 1970 and is the basis for the relational
database management system (RDBMS). It contains the following components:
bull a collection of objects, or relations;
bull a set of operations to act on the relations.
Q5
EITHER
(a) What is a deadlock? How can it be avoided? How can it be resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions each require access to data
that is locked by the other. It can be handled in two ways: by taking measures that prevent
deadlocks from happening, and by providing ways to break a deadlock after it happens. One
way to prevent deadlocks is to require the user to request all necessary locks at one time,
ensuring they gain access to everything they need or to nothing. Deadlocks can also sometimes
be avoided by setting a resource-access order, meaning resources must be locked in a fixed
order, which prevents a circular wait from arising. Once a deadlock does occur, the DBMS must
have a method for detecting it; to resolve it, the DBMS selects a victim transaction to cancel
and reverts that entire transaction so that the resources it held become available, allowing one
transaction to complete while the other is reprocessed at a later time.
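Deadlock detection is typically implemented by maintaining a wait-for graph (an edge T1 → T2 means T1 is waiting for a lock held by T2) and searching it for a cycle. A minimal sketch:

```python
def has_cycle(wait_for):
    """Detect a cycle in a wait-for graph given as {txn: [txns it waits for]}."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on current path / done
    color = {t: WHITE for t in wait_for}

    def visit(t):
        color[t] = GREY
        for u in wait_for.get(t, []):
            if color.get(u, WHITE) == GREY:   # back edge: a cycle, i.e. a deadlock
                return True
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in wait_for)

# T1 waits for T2 and T2 waits for T1: the classic two-transaction deadlock.
print(has_cycle({"T1": ["T2"], "T2": ["T1"]}))  # True
print(has_cycle({"T1": ["T2"], "T2": []}))      # False
```

When the check returns True, the DBMS picks one transaction on the cycle as the victim and rolls it back, which removes its node from the graph and breaks the cycle.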
(b) Explain the meaning of the expression 'ACID transaction'.
Ans: ACID stands for Atomicity, Consistency, Isolation, Durability. Any transaction should be
atomic: it either completes fully or has no effect at all; there is no such thing as a semi-complete
transaction. The database state should remain consistent after the completion of the transaction.
If there is more than one transaction, they should be scheduled in such a fashion that they remain
isolated from one another. Durability means that once a transaction commits, its effects will
persist even if there are system failures.
(c) What is the purpose of transaction isolation levels?
Ans: Transaction isolation
levels affect how the database operates while transactions are being processed. Their purpose is
to ensure consistency throughout the database. For example, if I am changing a row that affects
the calculations or outputs of several other rows, then all rows that are (or may be) affected by
the change are locked until my change is complete. This isolates the change and ensures that
data interaction remains accurate and consistent; it is known as transaction-level consistency.
A transaction in progress can also affect how other rows are read. Say I am processing a change
to the tax rate in my state; my store clerk should not be able to read the total cost of a blue shirt,
because the total-cost row is affected by any change in the tax-rate row. How the reading and
viewing of data is handled while a change is being processed but has not yet been committed is
what the transaction isolation level controls. Its purpose is to ensure that no one is misinformed
prior to a transaction being committed.
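The tax-rate scenario can be reproduced with SQLite, whose default journaling lets readers see only committed data. The table and values below are invented for illustration:

```python
import os
import sqlite3
import tempfile

# A shared on-disk database so that two connections see the same data.
path = os.path.join(tempfile.mkdtemp(), "shop.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE tax (rate REAL)")
conn.execute("INSERT INTO tax VALUES (0.05)")
conn.commit()
conn.close()

writer = sqlite3.connect(path)   # the transaction changing the tax rate
reader = sqlite3.connect(path)   # the store clerk

writer.execute("UPDATE tax SET rate = 0.08")  # change pending, not committed

# The clerk is not misinformed: only the committed rate is visible.
before = reader.execute("SELECT rate FROM tax").fetchall()[0][0]
writer.commit()
after = reader.execute("SELECT rate FROM tax").fetchall()[0][0]
print(before, after)  # 0.05 0.08
```

SQLite offers only a few isolation behaviors; server DBMSs expose the full range (read uncommitted through serializable), but the principle shown, hiding uncommitted changes from other sessions, is the same.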
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. Concurrency
control protocols ensure the atomicity, isolation, and serializability of concurrent transactions.
They can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a transaction
cannot read or write data until it acquires an appropriate lock on it. Locks are of two kinds:
Binary locks - A lock on a data item can be in two states; it is either locked or unlocked.
Shared/exclusive locks - This type of locking mechanism differentiates locks based on their
use. If a lock is acquired on a data item in order to perform a write operation, it is an
exclusive lock, since allowing more than one transaction to write to the same data item
would lead the database into an inconsistent state. Read locks are shared because no data
value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow a transaction to obtain a lock on every object before a
write operation is performed. The transaction may unlock a data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of the data items on which
they need locks. Before initiating execution, the transaction requests all the locks it needs from
the system. If all the locks are granted, the transaction executes and releases all the locks when
its operations are over. If the locks are not all granted, the transaction rolls back and waits until
they are granted.
Two-Phase Locking (2PL)
This locking protocol divides the execution of a transaction into three parts. In the first part,
when the transaction starts executing, it seeks permission for the locks it requires. The second
part is where the transaction acquires the locks. As soon as the transaction releases its first lock,
the third phase starts; in this phase the transaction cannot demand any new locks, it only
releases the acquired locks.
Two-phase locking thus has two phases: a growing phase, in which the transaction acquires all
its locks, and a shrinking phase, in which the locks held by the transaction are released.
To claim an exclusive (write) lock, a transaction may first acquire a shared (read) lock and then
upgrade it to an exclusive lock.
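The two-phase rule can be enforced with a simple transaction-side check: once a lock is released (the shrinking phase begins), any further acquire is rejected. This is a minimal sketch that ignores lock modes and conflicts between transactions:

```python
class TwoPhaseTxn:
    """Tracks one transaction's locks and enforces the two-phase rule."""

    def __init__(self):
        self.locks = set()
        self.shrinking = False   # flips to True at the first release

    def acquire(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: acquiring after a release")
        self.locks.add(item)

    def release(self, item):
        self.shrinking = True    # the growing phase is over for good
        self.locks.discard(item)

t = TwoPhaseTxn()
t.acquire("A")
t.acquire("B")
t.release("A")           # shrinking phase begins
try:
    t.acquire("C")       # illegal under 2PL
except RuntimeError as e:
    print(e)             # 2PL violation: acquiring after a release
```

Strict 2PL corresponds to never calling release() until commit, which is why it avoids cascading aborts.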
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all its locks in the first phase,
the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not
release a lock after using it: it holds all the locks until the commit point and releases them all at
once. Strict-2PL therefore avoids the cascading aborts that 2PL can suffer from.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either the system time or a logical counter as a timestamp.
Lock-based protocols manage the order between conflicting pairs of transactions at execution
time, whereas timestamp-based protocols start working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and ordering is determined by the age of
the transaction: a transaction created at clock time 0002 is older than all transactions that enter
after it. For example, a transaction y entering the system at 0004 is two seconds younger, and
priority is given to the older one.
In addition, every data item carries its latest read-timestamp and write-timestamp, which let the
system know when the last read and write operations were performed on it.
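The core rule of basic timestamp ordering follows directly from those per-item timestamps: an operation is allowed only if it does not arrive "too late" relative to younger transactions, otherwise the transaction must abort. A simplified sketch (real schedulers also handle transaction restarts and Thomas's write rule):

```python
class TimestampScheduler:
    """Basic timestamp-ordering checks, one read/write timestamp per item."""

    def __init__(self):
        self.read_ts = {}    # latest read timestamp per data item
        self.write_ts = {}   # latest write timestamp per data item

    def read(self, ts, item):
        if ts < self.write_ts.get(item, 0):
            return False     # item was overwritten by a younger txn: abort
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        if ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0):
            return False     # a younger txn already read or wrote it: abort
        self.write_ts[item] = ts
        return True

s = TimestampScheduler()
print(s.read(ts=5, item="X"))   # True: first access to X
print(s.write(ts=3, item="X"))  # False: an older txn cannot overwrite a newer read
print(s.write(ts=7, item="X"))  # True
```

The rejected write at timestamp 3 is how the protocol enforces the serialization order "older before younger" without using locks at all.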
OR
(b) Explain database security mechanisms.
Ans: Database security covers and enforces security on all aspects and components of databases. This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented, and maintained by a database administrator
and/or other information security professionals.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong, multifactor access and
data-management controls.
Load/stress testing and capacity testing of the database to ensure it does not crash under a
distributed denial-of-service (DDoS) attack or user overload.
Physical security of the database server and backup equipment against theft and natural
disasters.
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them.
(d) Explain knowledge-based database systems in detail.
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties
Flat data Data was usually represented in a tabular format with strings or numbers in each
field
Multiple users A conventional database needed to support more than one user or system
logged into the same data at the same time
Transactions An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users These are the so-
called ACID properties Atomicity Consistency Isolation and Durability
Large long-lived data A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data Such a database usually needed to persist past
the specific uses of any individual program it needed to store data for years and decades
rather than for the life of a program
The first knowledge-based systems had data needs that were the opposite of these database
requirements An expert system requires structured data Not just tables with numbers and
strings but pointers to other objects that in turn have additional pointers The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes subclasses and instances
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database. The knowledge-base needed to know facts about the world, for example to represent
the statement that 'all humans are mortal'. A database typically could not represent this general
knowledge, but would instead need to store information about thousands of specific humans.
Representing that all humans are mortal, and being able to reason that any given human is
mortal, is the work of a knowledge-base; representing that George, Mary, Sam, Jenna, Mike,
and hundreds of thousands of other customers are all humans with specific ages, sexes,
addresses, etc. is the work of a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements, such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies, best practices, reusable designs and code, etc. In both cases, the distinctions between
the uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, versus knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018–2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations
without their conflicting with one another. Concurrent access is quite easy if all users are just
reading data, since there is no way they can interfere with one another. Any practical database,
though, has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address the conflicts that mostly occur in a multi-user system.
It helps you make sure that database transactions are performed concurrently without violating
the data integrity of the respective databases.
Concurrency control is therefore a most important element for the proper functioning of a
system in which two or more database transactions that require access to the same data are
executed simultaneously.
(ii) Atomicity property
Ans: In database systems, atomicity (from Ancient Greek ἄτομος, átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur or none occurs.[1] The guarantee of atomicity prevents updates to the database occurring
only partially, which can cause greater problems than rejecting the whole series outright. As a
consequence, the transaction cannot be observed to be in progress by another database client:
at one moment in time it has not yet happened, and at the next it has already occurred in whole
(or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a
consistent state, that is, money is neither lost nor created if either of the two operations fails.
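The bank-transfer example can be demonstrated with SQLite: wrapping both updates in one transaction and rolling back on failure leaves the balances untouched. Account names and amounts are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    try:
        cur = conn.cursor()
        cur.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                    (amount, src))
        # Simulate a failure between the two operations of the transfer.
        if amount > 100:
            raise RuntimeError("insufficient funds")
        cur.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                    (amount, dst))
        conn.commit()            # both updates become durable together
    except Exception:
        conn.rollback()          # neither update survives: money is not lost

transfer(conn, "A", "B", 500)    # fails midway and rolls back
print(conn.execute("SELECT balance FROM account ORDER BY id").fetchall())
# [(100,), (50,)]: the failed transfer left the database consistent
```

Even though the withdrawal from A had already executed when the failure occurred, the rollback erases it, which is exactly the all-or-nothing guarantee described above.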
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database; it describes only the part of the
actual database that is relevant to the user, because each user is not concerned with the entire
database. For example, end users and application programmers get different external views.
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language such as COBOL or C or a fourth-generation
language specific to the DBMS such as visual FoxPro or MS Access
The end user uses a query language to access data from the database A query language is a
combination of three subordinate language
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares database objects, while the data
manipulation language performs operations on these objects. The data control language is used
to control the user's access to database objects.
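The first two sub-languages can be illustrated with SQLite from Python (DCL statements such as GRANT/REVOKE require a server DBMS, so only DDL and DML are shown; the table and data are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and declare a database object.
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object.
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("INSERT INTO student VALUES (2, 'Ravi')")
conn.execute("UPDATE student SET name = 'Asha K' WHERE roll = 1")

print(conn.execute("SELECT name FROM student ORDER BY roll").fetchall())
# [('Asha K',), ('Ravi',)]
```

In a server DBMS, a DCL statement like `GRANT SELECT ON student TO clerk` would complete the trio by restricting which users may run the DML above.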
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data 'as it really is'. The user's view of the data is constrained by the language
being used; at the conceptual level, the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for a DBMS are explained above.
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions
specified in the DDL. It includes metadata information such as the names of files and data
items, storage details of each file, mapping information, and constraints.
2. DML Compiler and Query Optimizer - DML commands such as insert, update, delete, and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized by the query optimizer into the
best way to execute the query and sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also
known as the Database Control System. The main functions of the Data Manager are:
Convert operations on user queries, coming from the application programs or from the
combination of DML compiler and query optimizer (known as the Query Processor), from
the user's logical view to the physical file system.
Control DBMS information access that is stored on disk.
Handle buffers in main memory.
Enforce constraints to maintain the consistency and integrity of the data.
Synchronize the simultaneous operations performed by concurrent users.
Control the backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the
database. It contains information about:
1. Data: names of the tables, names of the attributes of each table, lengths of attributes, and
number of rows in each table.
2. Relationships between database transactions and the data items referenced by them, which
is useful in determining which transactions are affected when certain data definitions are
changed.
3. Constraints on data, i.e., the range of values permitted.
4. Detailed information on physical database design, such as storage structures, access paths,
and file and record sizes.
5. Access authorization: a description of database users, their responsibilities, and their access
rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to control data integrity, database operation, and accuracy, and may
be regarded as an important part of the DBMS.
Importance of the Data Dictionary - The data dictionary is necessary in databases for the
following reasons:
It improves the DBA's control over the information system and the users' understanding of
how to use the system.
It helps in documenting the database design process by storing documentation of the results
of every design phase and of design decisions.
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts high-level queries into low-level file access
commands, known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve users: Naïve users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction and responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect only a precise portion of the database; in the case of the ATM user, only one or more of his or her own accounts. Other such naïve uses are ones where the type and range of responses is always indicated to the user. Thus, even a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise within the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naïve users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing the application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, PASCAL, or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users most familiar with the database and are responsible for creating, modifying, and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications and is responsible for the definition and implementation of the internal level, including the storage structure and the access methods to be used for optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different
application systems. This stresses the importance of multiple applications sharing data: the
database becomes a common resource for an agency. It implies separation of the physical
storage from the use of the data by an application program, i.e., program/data independence.
The user, programmer, or application specialist need not know the details of how the data are
stored; such details are transparent to the user. Changes can be made to the data without
affecting other components of the system, e.g., changing the format of data items (real to
integer), changing the file structure (reorganizing data internally or changing the mode of
access), or relocating data from one device to another (e.g., from optical to magnetic storage,
or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system, every user group
maintains its own files for handling its data. This may lead to:
bull duplication of the same data in different files;
bull wastage of storage space, since duplicated data is stored;
bull errors generated due to updating of the same data in different files;
bull time wasted in entering the same data again and again;
bull computer resources needlessly used;
bull great difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to another
file. This can leave the data inconsistent, so we need to remove this duplication of data across
multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In
a conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined into one centralized database, the availability of information and its
currency is likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to anticipated information requests.
Centralizing the data in a database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, the use of a DBMS
should allow users who do not know programming to interact with the data more easily, unlike
a file processing system in which a programmer may need to write new programs to meet every
new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the data in
the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce integrity
constraints. In conventional systems, because the data is duplicated in multiple files, updates
may sometimes lead to entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the
purpose of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an
ad-hoc/temporary manner. Often different systems of an organization access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database, and
different checks can be established for each type of access (retrieve, modify, delete, etc.) to
each piece of information in the database.
8. Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work, and therefore its own
needs, the most important. Once a database has been set up with centralized control, it
becomes necessary to identify the organization's requirements and to balance the needs of the
competing units, so it may become necessary to ignore some requests for information if they
conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for a similar service using conventional systems,
since the productivity of programmers can be higher when using the non-procedural languages
provided with a DBMS than when using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, files are more likely to be designed to meet the demands of particular
applications, and the overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - A centralized database provides schemes for backup and
recovery from failures, including disk crashes, power failures and software errors, which
help the database recover from an inconsistent state to the state that existed prior to the
failure, though the recovery methods themselves are complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The E-R model is a "top-down" approach. This data model allows us to describe how data
is used in a real-world enterprise. Building the model is an iterative, team-oriented process in
which all business managers (or their designates) should be involved, and the result should be
validated with a "bottom-up" approach. Many notation methods exist; Chen's was the first to
become established.
The building blocks of the E-R model are its three primary components: entities, relationships
and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID,
student name, address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One-to-many (1:M)
Many-to-one (M:1)
Many-to-many (M:N)
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example:
Given: entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), where street is
itself composite (street_name, street_number, apartment_number).
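The composite and multivalued attribute kinds listed above can be sketched as plain Python structures (illustrative only; the field values below are invented, and phone numbers are shown as multivalued for demonstration):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Name:                   # composite attribute
    first_name: str
    last_name: str
    middle_name: str

@dataclass
class Address:                # composite attribute (street kept simple here)
    city: str
    state: str
    zip_code: str
    street: str

@dataclass
class Customer:               # the entity
    customer_id: int          # primary key attribute
    name: Name
    phone_numbers: List[str]  # multivalued attribute
    date_of_birth: str
    address: Address

c = Customer(1, Name("Asha", "Deshpande", "R"), ["555-0100", "555-0101"],
             "1990-05-17", Address("Nagpur", "MH", "440001", "Main St"))
```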
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index-sequential files and direct files we have considered the retrieval and
update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file on the attribute "stud_name", we get the set of
records which satisfy the given value.
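A minimal sketch of the idea in Python (the student records are hypothetical; the secondary index maps each non-unique stud_name to all matching primary keys):

```python
from collections import defaultdict

# Student records keyed by primary key (roll number)
students = {
    1: {"stud_name": "Asha", "city": "Nagpur"},
    2: {"stud_name": "Ravi", "city": "Pune"},
    3: {"stud_name": "Asha", "city": "Mumbai"},
}

# Secondary index on the non-unique attribute stud_name:
# each name maps to the list of primary keys that carry it.
by_name = defaultdict(list)
for roll_no, rec in students.items():
    by_name[rec["stud_name"]].append(roll_no)

# Secondary key retrieval may return several records
matches = [students[r] for r in by_name["Asha"]]
```

Unlike a primary-key lookup, the result is a set of records, which is exactly the point made in (ii) above.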
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3- EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give an equivalent QBE
expression for each of the following queries -
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot
have a lossless decomposition into any number of smaller tables. Another way of expressing this is
that every join dependency is a consequence of the candidate keys. It can also be expressed as:
there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
you always need to know two values (pairwise), and
for any one you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:

buyer   vendor          item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers
The question is: what do you do if Claiborne starts to sell Jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
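The decomposition can be checked on the sample data with a small sketch (plain Python, set-based; the natural join is spelled out by hand):

```python
# Sample Buying(buyer, vendor, item) rows from the example above
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach",       "Jeans"),
    ("Mary",  "Jordach",       "Jeans"),
    ("Sally", "Jordach",       "Sneakers"),
}

# Project onto the three pairwise tables of the 5NF decomposition
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural join of the three projections
rejoined = {
    (b, v, i)
    for (b, v) in buyer_vendor
    for (v2, i) in vendor_item if v2 == v
    if (b, i) in buyer_item
}

# The join dependency holds on this data: joining the three
# projections neither loses rows nor invents spurious ones.
assert rejoined == buying
```

Note that joining only two of the projections (e.g. Buyer-Vendor with Vendor-Item) would produce the spurious row (Mary, Jordach, Sneakers); it is the third table that filters it out, which is why all three are needed.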
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product designed to support
both batch and online application programs.
[Architecture diagram: Application A and Application B, each written in a host language + DL/I.
Each application has its own PSB (PSB-A, PSB-B), made up of PCBs; the PCBs refer to the DBDs
that define the physical databases; the IMS control program mediates between the applications
and the stored data.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD), which also defines its mapping to storage.
The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal
mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
1 DBD NAME=EDUCPDBD
2 SEGM NAME=COURSE,BYTES=256
3 FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4 FIELD NAME=TITLE,BYTES=33,START=4
5 FIELD NAME=DESCRIPN,BYTES=220,START=37
6 SEGM NAME=PREREQ,PARENT=COURSE,BYTES=36
7 FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8 FIELD NAME=TITLE,BYTES=33,START=4
9 SEGM NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called the program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the logical
database and the corresponding physical database.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example:
1 PCB TYPE=DB,DBDNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End-users are
supported via user-written online application programs; IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: the value of one attribute (the determinant) determines the value of
another attribute.
Candidate Key: a possible key. Each non-key field is functionally dependent on every candidate
key, and no attribute in the key can be deleted without destroying the property of unique
identification.
The main characteristics of the functional dependencies used in normalization are that they have
a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency,
they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is
important to find an approach that can reduce the set to a manageable size. We need to identify a
set of functional dependencies (X) for the relation that is smaller than the complete set of
functional dependencies (Y) for that relation, and that has the property that every functional
dependency in Y is implied by the functional dependencies in X.
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal-form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF;
here we pay particular attention up to 3NF. Database designers need not normalize to the highest
possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the
functional dependencies between its attributes. It is often executed as a series of steps, each of
which corresponds to a specific normal form with known properties. As normalization proceeds,
relations become progressively more restricted (stronger) in format and also less vulnerable to
update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all its multivalued dependencies are in fact functional dependencies; 4NF
thus removes the unwanted structures, namely multivalued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
there is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF; fourth normal form differs from BCNF only in that it also
considers multivalued dependencies.
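As an example, consider a hypothetical relation (course, teacher, book) in which teachers and recommended books are independent facts about a course, so course ->> teacher and course ->> book hold. The 4NF decomposition and its losslessness can be sketched in Python (the data below is invented for illustration):

```python
# Relation with a multivalued dependency: because teachers and books
# are independent, every (teacher, book) combination appears per course.
ctb = {
    ("DBMS", "Rao",   "Date"),
    ("DBMS", "Rao",   "Ullman"),
    ("DBMS", "Joshi", "Date"),
    ("DBMS", "Joshi", "Ullman"),
}

# 4NF decomposition: split the two independent facts apart
course_teacher = {(c, t) for c, t, b in ctb}
course_book    = {(c, b) for c, t, b in ctb}

# Natural join on course reconstructs the original relation,
# so the decomposition is lossless.
rejoined = {
    (c, t, b)
    for (c, t) in course_teacher
    for (c2, b) in course_book if c2 == c
}
assert rejoined == ctb
```

The decomposed tables also avoid the redundancy: adding a third teacher means inserting one row into course_teacher, not one row per book.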
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and in the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as they are in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined. A
general characteristic, however, is that the programming language and the database schema use the same
type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the
set of all its versions, and object versions can be treated as objects in their own right. Some object
databases also provide systematic support for triggers and constraints, which are the basis of active
databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could get a user's account information and
efficiently provide them with extensive information such as transactions and account-information entries.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to
explore the options available in order to prepare for the worst.
SQL Server database recovery is easier to achieve if you are running at least SQL Server 2000,
which has a built-in feature known as the database recovery model that controls the following:
the speed and size of your transaction log backups, and
the degree to which you might be at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery model available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. The SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified point in time can be achieved after media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
The Log Marks feature allows you to place reference points in the transaction log so that you can
recover to a log mark.
This model also logs CREATE INDEX operations, so recovery from a transaction log backup that
includes index creations is faster, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the
least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans
F1 Reflexivity: X → X
F2 Augmentation: If Z ⊆ W and X → Y, then XW → YZ
F3 Additivity: If X → Y and X → Z, then X → YZ
F4 Projectivity: If X → YZ, then X → Y
F5 Transitivity: If X → Y and Y → Z, then X → Z
F6 Pseudotransitivity: If X → Y and YZ → W, then XZ → W
Examples of the use of inference axioms
[From Ullman]
1. Consider R = (Street, Zip, City), F = {City Street → Zip, Zip → City}.
We want to show Street Zip → Street Zip City.
Proof:
1. Zip → City - given
2. Street Zip → Street City - augmentation of (1) by Street
3. City Street → Zip - given
4. City Street → City Street Zip - augmentation of (3) by City Street
5. Street Zip → City Street Zip - transitivity from (2) and (4)
[From Maier]
2. Let R = (A, B, C, D, E, G, H, I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E - given
2. AB → AB - reflexivity
3. AB → B - projectivity from (2)
4. AB → BE - additivity from (1) and (3)
5. BE → I - given
6. AB → I - transitivity from (4) and (5)
7. E → G - given
8. AB → G - transitivity from (1) and (7)
9. AB → GI - additivity from (6) and (8)
10. GI → H - given
11. AB → H - transitivity from (9) and (10)
12. AB → GH - additivity from (8) and (11)
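The same derivation can be checked mechanically with the standard attribute-closure algorithm, sketched here in Python: AB → GH follows from F exactly when G and H are both in the closure AB+.

```python
def closure(attrs, fds):
    """Compute the attribute closure attrs+ under a set of FDs.

    fds is a list of (lhs, rhs) pairs, each side a set of attributes:
    repeatedly apply every FD whose left side is already contained
    in the result, until nothing more can be added.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# F from the Maier example above
F = [({"A", "B"}, {"E"}),
    ({"A", "G"}, {"J"}),
    ({"B", "E"}, {"I"}),
    ({"E"},      {"G"}),
    ({"G", "I"}, {"H"})]

# AB+ contains both G and H, so AB -> GH is implied by F
ab_plus = closure({"A", "B"}, F)
```

The closure grows as {A, B} → +E (by AB → E) → +I (by BE → I) → +G (by E → G) → +H (by GI → H), matching the twelve-step proof above.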
Significance in relational database design: A relational database is a database structure, commonly
used in GIS, in which data is stored in two-dimensional tables and multiple relationships between
data elements can be defined and established in an ad-hoc manner. A Relational Database Management
System (RDBMS) is a database system made up of files with data elements in a two-dimensional
array (rows and columns); it has the capability to recombine data elements to form different relations,
resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables that are
manipulated a set at a time rather than a record at a time; SQL is used to manipulate relational
databases. The relational model was proposed by Dr. Codd in 1970 and is the basis for the relational
database management system (RDBMS). The relational model contains the following components:
a collection of objects or relations, and
a set of operations to act on the relations.
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that
is being locked by the other user. It can be dealt with in two ways: one is to take measures which
prevent deadlocks from happening, and the other is to provide ways to break a deadlock
after it happens. One way to prevent or avoid deadlocks is to require the user to request
all necessary locks at one time, ensuring they gain access to everything they need or to
nothing. Secondly, deadlocks can sometimes be avoided by setting a resource-access order,
meaning resources must be locked in a certain order to prevent such instances. Essentially,
once a deadlock does occur, the DBMS must have a method for detecting the deadlock,
and to resolve it the DBMS must select a transaction to cancel and revert that entire
transaction until the required resources become available, allowing one transaction to
complete while the other has to be reprocessed at a later time.

Explain the meaning of the expression ACID transaction.
Ans: ACID means Atomicity, Consistency, Isolation, Durability. When any transaction happens, it
should be atomic: it should either be complete or fully incomplete; there should not be anything
like semi-complete. The database state should remain consistent after the completion of the
transaction. If there is more than one transaction, the transactions should be scheduled in such a
fashion that they remain in isolation from one another. Durability means that once a transaction
commits, its effects will persist even if there are system failures.

What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the process
of being changed. Their purpose is to ensure consistency throughout the database. For example, if
I am changing a row which affects the calculations or outputs of several other rows, then all rows
that are affected, or possibly affected, by a change in the row I am working on will be locked from
changes until my change is complete. This isolates the change and ensures that the data
interaction remains accurate and consistent, and is known as transaction-level consistency. The
transaction being changed may also affect how other rows are read. Say I am processing a change
to the tax rate in my state: my store clerk should not be able to read the total cost of a blue shirt,
because the total-cost row is affected by any change in the tax-rate row. Essentially, how you deal
with the reading and viewing of data while a change is being processed but has not yet been
committed is the transaction isolation level. Its purpose is to ensure that no one is misinformed
prior to a transaction being committed.
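The detect-and-cancel approach described in (a) is usually implemented with a wait-for graph: an edge T1 → T2 means T1 is waiting for a lock held by T2, and a cycle means deadlock. A minimal sketch (illustrative Python; the transaction names are hypothetical):

```python
# Wait-for graph: edge "T1" -> "T2" means T1 waits for a lock held by T2.
waits_for = {
    "T1": ["T2"],
    "T2": ["T3"],
    "T3": ["T1"],   # T3 waits on T1, closing a cycle: deadlock
    "T4": [],
}

def find_cycle(graph):
    """Return one cycle as a list of transactions, or None."""
    def dfs(node, path, on_path):
        if node in on_path:
            return path[path.index(node):]   # cycle found
        on_path.add(node)
        path.append(node)
        for nxt in graph.get(node, []):
            cycle = dfs(nxt, path, on_path)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        return None

    for start in graph:
        cycle = dfs(start, [], set())
        if cycle:
            return cycle
    return None

victim = None
cycle = find_cycle(waits_for)
if cycle:
    # The DBMS picks a victim in the cycle and rolls it back,
    # freeing its locks so the remaining transactions can finish.
    victim = cycle[0]
```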
(b) Explain concurrency control and database recovery in detail.
Ans: In a multiprogramming environment, where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions. We have
concurrency control protocols to ensure atomicity, isolation and serializability of concurrent
transactions. Concurrency control protocols can be broadly divided into two categories:
Lock-based protocols
Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a
transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two
kinds:
Binary locks - A lock on a data item can be in two states: it is either locked or
unlocked.
Shared/exclusive locks - This type of locking mechanism differentiates the locks based on
their use. If a lock is acquired on a data item to perform a write operation, it is an
exclusive lock; allowing more than one transaction to write on the same data item
would lead the database into an inconsistent state. Read locks are shared, because no data
value is being changed.
There are four types of lock protocol available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a
write operation is performed. Transactions may unlock the data item after completing the
'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of the data items on which they
need locks. Before initiating an execution, the transaction requests all the locks it needs from the
system. If all the locks are granted, the transaction executes, releasing all the locks
when all its operations are over. If any lock is not granted, the transaction rolls back and
waits until all the locks are granted.
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first
part, when the transaction starts executing, it seeks permission for the locks it requires. The
second part is where the transaction acquires all the locks. As soon as the transaction releases its
first lock, the third phase starts; in this phase the transaction cannot demand any new locks, it
only releases the acquired locks.
Two-phase locking thus has two phases: a growing phase, where all the locks are being acquired by
the transaction, and a shrinking phase, where the locks held by the transaction are being
released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the
transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a
lock after using it: Strict-2PL holds all the locks until the commit point and releases them all
at once.
Strict-2PL does not have cascading aborts, as 2PL does.
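The growing/shrinking rule can be sketched as a tiny per-transaction checker (illustrative Python only; a real DBMS also tracks lock modes, queues and owners per data item):

```python
class TwoPhaseTransaction:
    """Enforces the 2PL rule: no lock may be acquired
    after the first lock has been released."""

    def __init__(self):
        self.locks = set()
        self.shrinking = False   # flips when the first lock is released

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after unlock")
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True    # growing phase is over
        self.locks.discard(item)

t = TwoPhaseTransaction()
t.lock("A")
t.lock("B")      # growing phase: acquiring more locks is fine
t.unlock("A")    # shrinking phase begins
# t.lock("C") would now raise RuntimeError: the protocol forbids it
```

Strict-2PL corresponds to never calling unlock until commit, at which point all locks are released together.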
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either the system time or a logical counter as the timestamp.
Lock-based protocols manage the order between conflicting pairs of transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at 0002 clock time would be older than all other
transactions that come after it; for example, any transaction y entering the system at 0004 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read- and write-timestamp, which lets the system
know when the last 'read and write' operation was performed on the data item.
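The read- and write-timestamps are used to reject operations that arrive "out of order". A sketch of the basic timestamp-ordering checks (illustrative Python; real protocols also assign new timestamps to restarted transactions):

```python
# Per-item timestamps: when the item was last read and last written.
item = {"read_ts": 0, "write_ts": 0}

def read(ts):
    # A transaction may not read a value written by a younger transaction
    if ts < item["write_ts"]:
        return "abort"
    item["read_ts"] = max(item["read_ts"], ts)
    return "ok"

def write(ts):
    # A write is rejected if a younger transaction has already
    # read or written the item
    if ts < item["read_ts"] or ts < item["write_ts"]:
        return "abort"
    item["write_ts"] = ts
    return "ok"

read(5)      # ok: read_ts becomes 5
write(3)     # abort: transaction 3 is older than the last reader
write(7)     # ok: write_ts becomes 7
```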
OR
(b) Explain database security mechanisms.
Ans: Database security covers and enforces security on all aspects and components of databases. This
includes:
Data stored in the database
The database server
The database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator
and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial of service (DDoS) attack or user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d) Explain knowledge-based database systems in detail.
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database. At the time (the 1970s), virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database. At this point in the history of Information Technology, the distinction
between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
Flat data: data was usually represented in a tabular format with strings or numbers in each
field.
Multiple users: a conventional database needed to support more than one user or system
logged into the same data at the same time.
Transactions: an essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-called ACID
properties: Atomicity, Consistency, Isolation and Durability.
Large, long-lived data: a corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in artificial
intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze and reuse
knowledge. For example, see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database. The knowledge-base needed to know facts about the world, for example to represent
the statement that "all humans are mortal". A database typically could not represent this general
knowledge, but would instead need to store information about thousands of specific humans.
Representing that all humans are mortal, and being able to reason that any given human is
mortal, is the work of a knowledge-base; representing that George, Mary, Sam, Jenna, Mike and
hundreds of thousands of other customers are all humans with specific ages, sexes, addresses,
etc. is the work of a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments,
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple, distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and object-oriented
communities, object-oriented databases such as Versant emerged. These were systems designed
from the ground up to have support for object-oriented capabilities, but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge base was the Internet. With the rise of the Internet,
documents, hypertext, and multimedia support were now critical for any corporate database. It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents. This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge
management vendors, such as Lotus Notes. Knowledge management actually predated the
Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term knowledge base to describe their repositories, but the
meaning had a subtle difference. In the case of previous knowledge-based systems, the
knowledge was primarily for the use of an automated system, to reason about and draw
conclusions about the world. With knowledge management products, the knowledge was
primarily meant for humans, for example to serve as a repository of manuals, procedures,
policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018 – 2019
Subject: DBMS
MCA 1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data: there is no way they can interfere with one another. However, any
practical database has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user
system. It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is an essential element for the proper functioning of a system
where two or more database transactions that require access to the same data
are executed simultaneously.
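The lost-update problem that concurrency control prevents can be sketched in a few lines of Python (an illustration added here, not part of the exam answer), with threads standing in for transactions and a lock standing in for the DBMS lock manager:

```python
import threading

balance = 100
lock = threading.Lock()  # serializes the read-modify-write cycle

def deposit(amount):
    global balance
    with lock:                       # only one "transaction" at a time
        current = balance            # READ the shared value
        balance = current + amount   # WRITE it back

threads = [threading.Thread(target=deposit, args=(1,)) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(balance)  # 200: no deposit was lost
```

Without the lock, two threads could both read 100 and both write 101, losing one update; the lock forces the interleaved operations into a serial order.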
(ii) Atomicity property
Ans: In database systems, atomicity (from Ancient Greek ἄτομος, átomos, "undividable") is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur, or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails.
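The bank-transfer example can be sketched with Python's sqlite3 module, whose connection context manager gives exactly this all-or-nothing behaviour (the table and account names here are illustrative, not from the syllabus):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    try:
        with conn:  # one atomic transaction: commits on success, rolls back on error
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            bal = conn.execute("SELECT balance FROM account WHERE name = ?",
                               (src,)).fetchone()[0]
            if bal < 0:
                raise ValueError("insufficient funds")  # forces rollback of the withdrawal
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
    except ValueError:
        pass  # transfer rejected; the database is unchanged

transfer(conn, "A", "B", 30)   # succeeds: both updates commit together
transfer(conn, "A", "B", 500)  # fails: both updates are rolled back together
print(dict(conn.execute("SELECT name, balance FROM account")))  # {'A': 70, 'B': 80}
```

The failed 500-unit transfer leaves no trace: the withdrawal that had already executed is undone, so money is neither lost nor created.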
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
• All users should be able to access the same data.
• A user's view is immune to changes made in other views.
• Users should not need to know physical database storage details.
• The DBA should be able to change database storage structures without affecting the users' views.
• The internal structure of the database should be unaffected by changes to physical aspects of storage.
• The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one that is closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database. This view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part that
is relevant to the user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
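The three sublanguages can be sketched with Python's sqlite3 module (an illustration added here; note that SQLite has no user accounts, so the DCL statement is shown only as the text a full DBMS such as Oracle would accept):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and declare a database object.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object.
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("UPDATE student SET name = 'Asha K' WHERE id = 1")

# DCL: control access to the object. SQLite cannot execute GRANT/REVOKE,
# so the statement is kept as text for illustration only.
dcl_example = "GRANT SELECT ON student TO report_user"

print(conn.execute("SELECT name FROM student").fetchone()[0])  # Asha K
```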
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for DBMS are suitably explained
above.
(C) Describe the structure of DBMS.
Ans: A DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update and
retrieval) on the database. The components of the DBMS perform these requested operations on the
database and provide the necessary data to the users.
Fig.: Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified
in the DDL. It includes metadata information such as the names of the files and data items, storage
details of each file, mapping information, and constraints.
2. DML Compiler and Query Optimizer - DML commands such as insert, update, delete and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized for the best way to execute the query by
the query optimizer and then sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
• It converts operations in users' queries, coming from the application programs or from the
combination of the DML compiler and query optimizer (known as the Query Processor), from the
user's logical view to the physical file system.
• It controls access to DBMS information that is stored on disk.
• It controls the handling of buffers in main memory.
• It enforces constraints to maintain the consistency and integrity of the data.
• It synchronizes the simultaneous operations performed by concurrent users.
• It controls the backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
5. Access authorization - a description of database users, their responsibilities
and their access rights.
6. Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation
and accuracy. It may be used as an important part of the DBMS.
Importance of Data Dictionary -
The data dictionary is necessary in databases due to the following reasons:
• It improves the control of the DBA over the information system and the users'
understanding of the use of the system.
• It helps in documenting the database design process by storing documentation of the
results of every design phase and of design decisions.
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts the high-level queries into low-level file access
commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interactions with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve Users: Naïve users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naïve users are those for whom the type and range of response is always indicated. Thus, a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online Users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naïve users requiring help, such as menus.
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of physical storage from the use of the
data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored; such details are
transparent to the user. Changes can be made to data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic operations), changing file
structure (reorganizing data internally or changing the mode of access), or relocating from one device to
another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files.
• Wastage of storage space, since duplicated data is stored.
• Errors generated due to updating of the same data in different files.
• Time wasted in entering data again and again.
• Computer resources needlessly used.
• Great difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data. So we need to remove this duplication of
data in multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In
a conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to anticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, use of a DBMS
should allow users that don't know programming to interact with the data more easily, unlike a
file processing system where the programmer may need to write new programs to meet every
new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when changing the
data in the database.
5. Integrity can be improved - Since data of the organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updating or
changes may sometimes lead to entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization would access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database. Different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8. Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers the work of its unit as the most
important, and therefore considers its needs as the most important. Once a database has been
set up with centralized control, it will be necessary to identify the organization's requirements and
to balance the needs of the competing units. So it may become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed with DBMSs than using procedural languages.
10. Data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as per the needs of particular
applications. The overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes such as
recovery and backup from failures, including disk crashes, power failures and software errors,
which may help the database to recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex.
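Backup and recovery can be sketched with Python's sqlite3 backup API (an illustration added here, using SQLite as a stand-in for the recovery schemes a full DBMS provides; table and values are hypothetical):

```python
import sqlite3

# Set up a small database and take a consistent backup copy of it.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE t (x INTEGER)")
src.execute("INSERT INTO t VALUES (42)")
src.commit()

backup = sqlite3.connect(":memory:")
src.backup(backup)            # snapshot the whole database

src.execute("DELETE FROM t")  # simulate a failure that destroys the data
src.commit()

# Recovery: restore the database from the backup copy.
backup.backup(src)
print(src.execute("SELECT x FROM t").fetchone()[0])  # 42: the prior state is back
```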
QUE 2- EITHER
(A) Explain the ER model with a suitable example.
Ans: It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process, with all business managers (or designates) involved, and
should be validated with a "bottom-up" approach. It has three primary components: entity,
relationship, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type. There are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for this term.
Attributes: An attribute is a characteristic of an entity. A student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an "owns" relationship between a company and a computer, a "supervises" relationship between an employee and a department, a "performs" relationship
between an artist and a song, a "proved" relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings:
Rectangles represent entity sets.
Diamonds represent relationship sets.
Lines link attributes to entity sets and entity sets to relationship sets.
Ellipses represent attributes.
Double ellipses represent multivalued attributes.
Dashed ellipses denote derived attributes.
Underline indicates primary key attributes.
Example:
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
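One way to realize this Customer entity as a relation (a hypothetical mapping added here: the composite attributes name, address and street are flattened into separate columns, and the underlined customer_id becomes the primary key):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite attributes become groups of simple columns.
conn.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,
        first_name       TEXT, middle_name TEXT, last_name TEXT,
        phone_number     TEXT,
        date_of_birth    TEXT,
        city TEXT, state TEXT, zip_code TEXT,
        street_name TEXT, street_number TEXT, apartment_number TEXT
    )
""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name, city) "
             "VALUES (1, 'Ravi', 'Patil', 'Nagpur')")
row = conn.execute("SELECT first_name, city FROM customer "
                   "WHERE customer_id = 1").fetchone()
print(row)  # ('Ravi', 'Nagpur')
```

A multivalued attribute (e.g. several phone numbers) would instead go into its own table keyed by customer_id.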
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file and direct file organizations, we have considered the
retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
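A minimal sketch of such a secondary index (an illustration added here, with hypothetical student records): the index maps each value of the non-unique key stud_name to the positions of all matching records, so one lookup can return several records.

```python
# Sample student file; stud_id is the primary key, stud_name a secondary key.
records = [
    {"stud_id": 1, "stud_name": "Amit"},
    {"stud_id": 2, "stud_name": "Neha"},
    {"stud_id": 3, "stud_name": "Amit"},
]

# Build the secondary index: key value -> list of record positions.
secondary_index = {}
for pos, rec in enumerate(records):
    secondary_index.setdefault(rec["stud_name"], []).append(pos)

# Unlike a primary-key lookup, one key value can match multiple records.
matches = [records[p] for p in secondary_index.get("Amit", [])]
print([r["stud_id"] for r in matches])  # [1, 3]
```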
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3- EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE 4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it
cannot be non-loss decomposed any further into smaller tables.
Another way of expressing this is: every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
• You always need to know two values (pairwise).
• For any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer   vendor          item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers
The question is: what do you do if Claiborne starts to sell Jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor, to determine the vendor you must know the buyer and
the item, and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
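The decomposition can be checked with a short Python sketch (added here as an illustration): project Buying onto the three two-column tables, natural-join them back, and then see what one new Vendor-Item row implies.

```python
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach",       "Jeans"),
    ("Mary",  "Jordach",       "Jeans"),
    ("Sally", "Jordach",       "Sneakers"),
}

# The three projections of the 5NF decomposition.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

def join3(bv, bi, vi):
    # Natural join of the three projections on their shared columns.
    return {(b, v, i) for b, v in bv for b2, i in bi if b2 == b and (v, i) in vi}

print(join3(buyer_vendor, buyer_item, vendor_item) == buying)  # True: lossless

# When Claiborne starts selling jeans, ONE new Vendor-Item row suffices;
# the join then derives the new buying facts for both existing jeans buyers.
vendor_item.add(("Liz Claiborne", "Jeans"))
print(len(join3(buyer_vendor, buyer_item, vendor_item) - buying))  # 2 new facts
```

In the single-table design the same change would require inserting one row per jeans buyer by hand, which is exactly the update anomaly 5NF removes.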
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
Application A                 Application B
(Host Language + DL/I)        (Host Language + DL/I)
        |                             |
      PSB-A                         PSB-B
   (PCB, PCB)                    (PCB, PCB)
         \                           /
            IMS control program
                    |
   DBD   DBD   DBD   DBD   DBD   DBD ...
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the
associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE#,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example:
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following:
(i) Functional dependency
Ans: Functional Dependency: The value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: A possible key.
Each non-key field is functionally dependent on every candidate key.
No attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of functional dependencies used in normalization:
• They have a 1:1 relationship between the attribute(s) on the left- and right-hand
side of the dependency.
• They hold for all time.
• They are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation,
and that has the property that every functional dependency in Y is implied by the
functional dependencies in X.
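The definition can be made concrete with a small Python check (an illustration added here; the relation and attribute names are hypothetical): X → Y holds in a relation iff no two rows agree on X but differ on Y.

```python
def holds(rows, x, y):
    """Return True iff the functional dependency x -> y holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in x)
        val = tuple(row[a] for a in y)
        # If this X-value was seen before with a different Y-value, FD fails.
        if seen.setdefault(key, val) != val:
            return False
    return True

employees = [
    {"emp_id": 1, "dept": "Sales", "dept_city": "Pune"},
    {"emp_id": 2, "dept": "Sales", "dept_city": "Pune"},
    {"emp_id": 3, "dept": "HR",    "dept_city": "Nagpur"},
]

print(holds(employees, ["dept"], ["dept_city"]))   # True: dept -> dept_city
print(holds(employees, ["dept_city"], ["emp_id"])) # False: two Pune rows differ
```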
(D) Explain 4NF with examples.
Ans: Normalization: The process of decomposing unsatisfactory, "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that it meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF.
We will pay particular attention up to 3NF.
The database designers need not normalize to the highest possible normal form.
It is a formal technique for analyzing a relation based on its primary key and the functional
dependencies between its attributes.
It is often executed as a series of steps. Each step corresponds to a specific normal form, which has
known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies.
• NF2: non-first normal form.
• 1NF: R is in 1NF iff all domain values are atomic.
• 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
• 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
• BCNF: R is in BCNF iff every determinant is a candidate key.
• Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and its multivalued dependencies are functional dependencies. 4NF
removes unwanted data structures: multivalued dependencies.
Either of the following conditions must hold for a relation to be in fourth normal form:
• There is no multivalued dependency in the relation, or
• There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers
multivalued dependencies.
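A hypothetical illustration of a multivalued dependency (added here, not part of the exam answer): one student's courses are independent of his hobbies, so a single table must pair every course with every hobby, and the 4NF decomposition removes that redundancy.

```python
courses = ["DBMS", "OS", "Networks"]  # one student's courses
hobbies = ["chess", "cricket"]        # the same student's hobbies

# name ->> course and name ->> hobby: the single table is a cross product.
single_table = [("Ravi", c, h) for c in courses for h in hobbies]
print(len(single_table))  # 6 rows for 3 courses x 2 hobbies

# The 4NF decomposition stores the two independent facts separately.
student_course = [("Ravi", c) for c in courses]
student_hobby  = [("Ravi", h) for h in hobbies]
print(len(student_course) + len(student_hobby))  # 5 rows, no redundant pairing
```

Adding a fourth course costs one row in the decomposed design but two redundant rows (one per hobby) in the single table, which is the update anomaly 4NF avoids.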
Q5
EITHER
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and in the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as they are in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is the way the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
The efficiency of such a database is also greatly improved in areas that demand massive amounts of data about one item. For example, a banking institution could retrieve a user's account information and efficiently provide extensive data such as transactions and account entries.
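A tiny sketch of the "pointer following" idea described above, using plain Python objects to stand in for database objects. The class names are invented for illustration.

```python
# Sketch: navigational ("pointer following") access in an object database,
# modeled with plain Python objects.

class Customer:
    def __init__(self, name):
        self.name = name

class Account:
    def __init__(self, number, owner):
        self.number = number
        self.owner = owner        # direct reference to another object
        self.transactions = []    # related objects, no foreign keys

alice = Customer("Alice")
acct = Account("A-1", alice)
acct.transactions.append({"amount": -50})

# No join: retrieve the owner by following the stored reference.
print(acct.owner.name)  # -> Alice

# A relational schema would instead store an owner_id column and match it
# against a customer table at query time, i.e. perform a join.
```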
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used determines how much time and space your backups take and how great your risk of data loss is when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery is easier to achieve if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls the following:
- The speed and size of your transaction log backups.
- The degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery models available:
- Full Recovery
- Bulk-Logged Recovery
- Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
- Database restoration up to any specified time can be achieved after a media failure for a database file. If your log file is available after the failure, you can restore up to the last committed transaction.
- The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
- CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index creations is faster, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
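SQL Server's recovery models cannot be demonstrated portably here, but the underlying idea — work that was never committed is rolled back during recovery, so the database comes back in its last consistent state — can be sketched with SQLite. The file and table names are invented for illustration.

```python
# Sketch (SQLite, not SQL Server): an uncommitted transaction is rolled
# back, so reopening the database "recovers" the last consistent state.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE t(x)")
con.execute("INSERT INTO t VALUES (1)")
con.commit()                               # durable: survives the "crash"

con.execute("INSERT INTO t VALUES (2)")    # in-flight, never committed
con.close()                                # simulate a crash before COMMIT

con = sqlite3.connect(path)                # reopen: only committed work remains
print(con.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # -> 1
```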
(d) Describe deadlocks in a distributed system.
Ans:
Join Dependencies (JD)
1. Zip → City – Given
2. Street Zip → Street City – Augmentation of (1) by Street
3. City Street → Zip – Given
4. City Street → City Street Zip – Augmentation of (3) by City Street
5. Street Zip → City Street Zip – Transitivity from (2) and (4)
[From Maier]
1. Let R = (A B C D E G H I), F = {AB → E, AG → J, BE → I, E → G, GI → H}.
Show that AB → GH is derived by F.
1. AB → E – Given
2. AB → AB – Reflexivity
3. AB → B – Projectivity from (2)
4. AB → BE – Additivity from (1) and (3)
5. BE → I – Given
6. AB → I – Transitivity from (4) and (5)
7. E → G – Given
8. AB → G – Transitivity from (1) and (7)
9. AB → GI – Additivity from (6) and (8)
10. GI → H – Given
11. AB → H – Transitivity from (9) and (10)
12. AB → GH – Additivity from (8) and (11)
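The derivation can also be checked mechanically by computing the attribute closure of AB under F. This is a sketch; note that the FD AG → J also fires during the computation, so J appears in the closure even though the step-by-step derivation does not need it.

```python
# Sketch: verify AB -> GH by computing the closure of AB under
# F = {AB->E, AG->J, BE->I, E->G, GI->H}.

def closure(attrs, fds):
    """Return the closure of `attrs` under FDs given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

F = [("AB", "E"), ("AG", "J"), ("BE", "I"), ("E", "G"), ("GI", "H")]
print(sorted(closure("AB", F)))  # -> ['A', 'B', 'E', 'G', 'H', 'I', 'J']
# G and H are both in the closure of AB, so AB -> GH holds.
```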
Significance in relational database design: a database structure commonly used in GIS, in which data is stored in two-dimensional tables and multiple relationships between data elements can be defined and established in an ad-hoc manner. A relational database management system is a database system made up of files with data elements arranged in a two-dimensional array (rows and columns). Such a system has the capability to recombine data elements to form different relations, resulting in great flexibility of data usage.
A relational database is perceived by the user as a collection of two-dimensional tables:
- Tables are manipulated a set at a time rather than a record at a time.
- SQL is used to manipulate relational databases.
- The model was proposed by Dr. Codd in 1970 and is the basis for the relational database management system (RDBMS).
- The relational model contains the following components:
  - a collection of objects, or relations
  - a set of operations to act on the relations
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions each require access to data that is locked by the other. It can be dealt with in two ways: one is to take measures that prevent deadlocks from happening, and the other is to define ways to break a deadlock after it happens. One way to prevent or avoid deadlocks is to require the user to request all necessary locks at one time, ensuring they gain access to everything they need or to nothing. Secondly, deadlocks can sometimes be avoided by setting a resource access order, meaning resources must be locked in a certain order to prevent such instances. Once a deadlock does occur, the DBMS must have a method for detecting it; to resolve it, the DBMS must select a transaction to cancel and revert the entire transaction until the resources required become available, allowing one transaction to complete while the other has to be reprocessed at a later time.
9.2.1 Explain the meaning of the expression "ACID transaction".
Ans: ACID stands for Atomicity, Consistency, Isolation, Durability. A transaction should be atomic: it should either complete fully or not at all; there should be nothing like a semi-complete transaction. The database state should remain consistent after the completion of the transaction. If there is more than one transaction, the transactions should be scheduled in such a fashion that they remain in isolation from one another. Durability means that once a transaction commits, its effects will persist even if there are system failures.
9.2.4 What is the purpose of transaction isolation levels?
Ans: Transaction isolation levels affect how the database operates while transactions are in the process of being changed. Their purpose is to ensure consistency throughout the database. For example, if I am changing a row that affects the calculations or outputs of several other rows, then all rows that are affected, or possibly affected, by my change will be locked from changes until my change is complete. This isolates the change and ensures that the data interaction remains accurate and consistent; this is known as transaction-level consistency. The transaction being changed, which may affect several other pieces of data or rows, can also affect how those rows are read. Say I am processing a change to the tax rate in my state: my store clerk should not be able to read the total cost of a blue shirt, because the total-cost row is affected by any change in the tax-rate row. Essentially, how the reading and viewing of data is handled while a change is being processed but has not yet been committed is the transaction isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction being committed.
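A rough sketch of this visibility rule, using SQLite rather than a full client-server DBMS: a row inserted but not yet committed by one connection is invisible to a second connection, so no reader is misinformed by in-progress changes. The file and table names are invented for illustration.

```python
# Sketch: uncommitted changes are isolated from other connections.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "iso.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE price(item TEXT, total REAL)")
writer.commit()

writer.execute("INSERT INTO price VALUES ('blue shirt', 21.99)")  # uncommitted
before = reader.execute("SELECT COUNT(*) FROM price").fetchone()[0]

writer.commit()                                                   # now visible
after = reader.execute("SELECT COUNT(*) FROM price").fetchone()[0]
print(before, after)  # -> 0 1
```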
(b) Explain concurrency control and database recovery in detail.
Ans: In a multiprogramming environment, where multiple transactions can be executed simultaneously, it is highly important to control the concurrency of transactions. We have concurrency control protocols to ensure atomicity, isolation, and serializability of concurrent transactions. Concurrency control protocols can be broadly divided into two categories:
- Lock-based protocols
- Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two kinds:
- Binary locks: a lock on a data item can be in two states; it is either locked or unlocked.
- Shared/exclusive locks: this type of locking mechanism differentiates locks based on their use. If a lock is acquired on a data item in order to perform a write operation, it is an exclusive lock, because allowing more than one transaction to write the same data item would lead the database into an inconsistent state. Read locks are shared, because no data value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a write operation is performed. Transactions may unlock the data item after completing the write operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first part, when the transaction starts executing, it seeks permission for the locks it requires. The second part is where the transaction acquires all the locks. As soon as the transaction releases its first lock, the third phase starts; in this phase, the transaction cannot demand any new locks, it only releases the acquired locks.
Two-phase locking thus has two phases: a growing phase, in which all the locks are being acquired by the transaction, and a shrinking phase, in which the locks held by the transaction are being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then upgrade it to an exclusive lock.
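The growing/shrinking rule itself can be sketched as a small guard class. This is an illustrative sketch, not a real lock manager; the class and method names are invented.

```python
# Sketch of the two-phase rule: once a transaction releases any lock,
# it may not acquire new ones.

class TwoPhaseTxn:
    def __init__(self):
        self.held = set()
        self.shrinking = False      # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violated: lock after an unlock")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True       # the growing phase ends here
        self.held.discard(item)

t = TwoPhaseTxn()
t.lock("A"); t.lock("B")            # growing phase
t.unlock("A")                       # shrinking phase begins
try:
    t.lock("C")                     # illegal under 2PL
except RuntimeError as e:
    print(e)                        # -> 2PL violated: lock after an unlock
```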
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as that of 2PL. After acquiring all the locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock just after using it: it holds all the locks until the commit point and releases them all at one time.
Strict-2PL does not suffer the cascading aborts that 2PL can.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between conflicting pairs among transactions at the time of execution, whereas timestamp-based protocols start working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age of the transaction. A transaction created at clock time 0002 would be older than all transactions that come after it; for example, any transaction y entering the system at 0004 is two seconds younger, and priority is given to the older one.
In addition, every data item is given the latest read-timestamp and write-timestamp. This lets the system know when the last read and write operations were performed on the data item.
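A minimal sketch of the basic timestamp-ordering rules implied above: an operation is rejected when it arrives "too late" relative to the item's recorded read/write timestamps. Names are invented; a real system would also restart the rolled-back transaction with a new timestamp.

```python
# Sketch of basic timestamp ordering (TO) on a single data item.

class Item:
    def __init__(self):
        self.read_ts = 0    # timestamp of the youngest reader so far
        self.write_ts = 0   # timestamp of the youngest writer so far

def read(item, ts):
    if ts < item.write_ts:              # a younger txn already wrote it
        return "rollback"
    item.read_ts = max(item.read_ts, ts)
    return "ok"

def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:
        return "rollback"               # write would arrive out of order
    item.write_ts = ts
    return "ok"

x = Item()
print(read(x, 2))    # -> ok
print(write(x, 3))   # -> ok
print(read(x, 1))    # -> rollback (older than the last writer)
```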
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases This
includes
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented, and maintained by a database administrator and/or other information security professional.
Some of the ways database security is analyzed and implemented include
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a distributed denial-of-service (DDoS) attack or under user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and implementing a road map/plan to mitigate them
(d) Explain knowledge-based database systems in detail.
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties:
- Flat data: data was usually represented in a tabular format with strings or numbers in each field.
- Multiple users: a conventional database needed to support more than one user or system logged into the same data at the same time.
- Transactions: an essential requirement for a database was to maintain integrity and consistency among data accessed by concurrent users. These are the so-called ACID properties: Atomicity, Consistency, Isolation, and Durability.
- Large, long-lived data: a corporate database needed to support not just thousands but hundreds of thousands or more rows of data. Such a database usually needed to persist past the specific uses of any individual program; it needed to store data for years and decades rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database requirements. An expert system requires structured data: not just tables with numbers and strings, but pointers to other objects that in turn have additional pointers. The ideal representation for a knowledge base is an object model (often called an ontology in the artificial intelligence literature) with classes, subclasses, and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data The data for the early expert systems was used to
arrive at a specific answer such as a medical diagnosis the design of a molecule or a response
to an emergency[1] Once the solution to the problem was known there was not a critical demand
to store large amounts of data back to a permanent memory store A more precise statement
would be that given the technologies available researchers compromised and did without these
capabilities because they realized they were beyond what could be expected and they could
develop useful solutions to non-trivial problems without them Even from the beginning the
more astute researchers realized the potential benefits of being able to store analyze and reuse
knowledge For example see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al[2]
The volume requirements were also different for a knowledge base compared to a conventional database. The knowledge base needed to know facts about the world, for example to represent the statement "All humans are mortal". A database typically could not represent this general knowledge, but would instead need to store information about thousands of tables representing information about specific humans. Representing that all humans are mortal, and being able to reason about any given human that they are mortal, is the work of a knowledge base. Representing that George, Mary, Sam, Jenna, Mike, and hundreds of thousands of other customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added capabilities to their products that provided support for knowledge-base requirements, such as class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies best practices reusable designs and code etc In both cases the distinctions between the
uses and kinds of systems were ill-defined As the technology scaled up it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning and knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by us humans
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another. Concurrent access is quite easy if all users are just reading data; there is no way they can interfere with one another. Any practical database, though, has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user system. It helps you make sure that database transactions are performed concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a system in which two or more database transactions that require access to the same data are executed simultaneously.
(ii) Atomicity property
Ans: In database systems, atomicity (from the Greek átomos, "undividable") is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client: at one moment in time it has not yet happened, and at the next it has already occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B. Performing these operations in an atomic transaction ensures that the database remains in a consistent state, that is, money is neither lost nor created if either of those two operations fails.
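The A-to-B transfer above can be sketched with SQLite: the two UPDATEs either both take effect (COMMIT) or neither does (ROLLBACK after a simulated failure). Table and function names are invented for illustration.

```python
# Sketch: atomicity of a two-step money transfer.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account(name TEXT PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
con.commit()

def transfer(con, amount, fail=False):
    try:
        con.execute("UPDATE account SET balance = balance - ? "
                    "WHERE name = 'A'", (amount,))
        if fail:                                  # simulated crash between
            raise RuntimeError("system failure")  # withdrawal and deposit
        con.execute("UPDATE account SET balance = balance + ? "
                    "WHERE name = 'B'", (amount,))
        con.commit()
    except RuntimeError:
        con.rollback()            # atomicity: the withdrawal is undone too

transfer(con, 50, fail=True)
print(dict(con.execute("SELECT * FROM account")))  # -> {'A': 100, 'B': 0}
transfer(con, 50)
print(dict(con.execute("SELECT * FROM account")))  # -> {'A': 50, 'B': 50}
```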
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
- All users should be able to access the same data.
- A user's view is immune to changes made in other views.
- Users should not need to know physical database storage details.
- The DBA should be able to change database storage structures without affecting the users' views.
- The internal structure of the database should be unaffected by changes to physical aspects of storage.
- The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
The three levels are explained in detail below.
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user view differs from the way data is stored in the database; this view describes only a part of the actual database. Because each user is not concerned with the entire database, only the part that is relevant to the user is visible. For example, end users and application programmers get different external views.
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language such as COBOL or C or a fourth-generation
language specific to the DBMS such as visual FoxPro or MS Access
The end user uses a query language to access data from the database. A query language is a combination of three subordinate languages:
- Data Definition Language (DDL)
- Data Manipulation Language (DML)
- Data Control Language (DCL)
The data definition language defines and declares database objects, while the data manipulation language performs operations on these objects. The data control language is used to control the user's access to database objects.
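A minimal illustration of the first two sub-languages, using SQLite. SQLite has no DCL (no GRANT/REVOKE), so that part is shown only as a comment; the table name is invented.

```python
# Sketch: DDL defines an object, DML operates on it.
import sqlite3

con = sqlite3.connect(":memory:")

# DDL: define and declare a database object.
con.execute("CREATE TABLE employee(id INTEGER PRIMARY KEY, name TEXT)")

# DML: insert and retrieve data from the object.
con.execute("INSERT INTO employee(name) VALUES ('Ada')")
rows = con.execute("SELECT name FROM employee").fetchall()
print(rows)  # -> [('Ada',)]

# DCL (not supported by SQLite; in other systems it would look like):
# GRANT SELECT ON employee TO some_user;
```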
Conceptual Level - This level comes between the external and the internal levels. The conceptual level represents the entire database as a whole and is used by the DBA. This level is the view of the data "as it really is". The user's view of the data is constrained by the language they are using; at the conceptual level, the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus the objectives of the three-level architecture proposal for DBMS are suitably explained above.
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2 DML Compiler and Query Optimizer - The DML commands (insert, update, delete, retrieve) from the application program are sent to the DML compiler for compilation into object code for database access. The object code is then optimized in the best way to execute the query by the query optimizer and sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also known as the Database Control System.
The main functions of the Data Manager are:
Convert operations in users Queries coming from the application programs or combination of
DML Compiler and Query optimizer which is known as Query Processor from users logical view
to physical file system
Controls DBMS information access that is stored on disk
It also controls handling buffers in main memory
It also enforces constraints to maintain consistency and integrity of the data
It also synchronizes the simultaneous operations performed by the concurrent users
It also controls the backup and recovery operations
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures, access paths, and file and record sizes.
5. Access authorization - a description of database users, their responsibilities, and their access rights.
6. Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control data integrity, database operation, and accuracy, and may be used as an important part of the DBMS.
Importance of Data Dictionary - the data dictionary is necessary in databases for the following reasons:
- It improves the control of the DBA over the information system and the users' understanding of the use of the system.
- It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
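As a concrete illustration, SQLite exposes exactly this kind of catalog through its sqlite_master table and the table_info pragma: table names, attribute names, and types are metadata queryable like ordinary data. The table and column names below are invented.

```python
# Sketch: SQLite's catalog playing the role of a data dictionary.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student(id INTEGER, name TEXT)")

# "Names of the tables" entry of the data dictionary:
tables = [r[0] for r in
          con.execute("SELECT name FROM sqlite_master WHERE type='table'")]
print(tables)  # -> ['student']

# "Names of attributes of each table" (with their declared types):
cols = [(r[1], r[2]) for r in con.execute("PRAGMA table_info(student)")]
print(cols)    # -> [('id', 'INTEGER'), ('name', 'TEXT')]
```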
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts high-level queries into low-level file access commands known as compiled DML.
7 End Users - The users of the database system can be classified into the following groups, depending on their degree of expertise or the mode of their interaction with the DBMS:
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve Users: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction, and he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online Users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal, or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying, and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It implies separation of physical storage from the use of the data by an application program, i.e. program/data independence: the user, programmer, or application specialist need not know the details of how the data are stored, since such details are transparent to the user. Changes can be made to the data without affecting other components of the system, e.g. changing the format of data items (real to integer arithmetic), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
- duplication of the same data in different files;
- wastage of storage space, since duplicated data is stored;
- errors generated due to updating of the same data in different files;
- time wasted entering the same data again and again;
- needless use of computer resources;
- difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data. We therefore need to remove this duplication of data across multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In conventional systems, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, use of a DBMS allows users who do not know programming to interact with the data more easily, unlike a file processing system, where the programmer may need to write new programs to meet every new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints. In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an ad hoc, temporary manner. Often different systems of an organization access different components of the operational data; in such an environment, enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8. Organization's requirements can be identified - All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, as the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. It may even become necessary to ignore some requests for information if they conflict with higher-priority needs of the organization. It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for a similar service using conventional systems, since the productivity of programmers can be higher using the non-procedural languages that have been developed alongside DBMSs than using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed as the needs of particular applications demand; the overall view is often not considered. Building an overall view of the organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes for backup and recovery from failures, including disk crashes, power failures and software errors, which help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods involved are very complex.
QUE2- EITHER
(A) Explain the ER model with a suitable example.
Ans: The E-R model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Modelling is an iterative, team-oriented process in which all business managers (or their designates) should be involved, and the result should be validated with a "bottom-up" approach. The model has three primary components: entities, relationships and attributes. There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name, address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many (1 : M)
Many to one (M : 1)
Many to many (M : M)
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example:
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), where street is itself composite (street_name, street_number, apartment_number).
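The Customer entity above can be sketched as a small class model. The following is an illustrative Python sketch (the class and field names follow the example, not any particular DBMS), showing how composite (address, street), multivalued (phone_numbers) and derived (age) attributes differ:

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Street:                       # composite attribute nested inside Address
    street_name: str
    street_number: str
    apartment_number: str = ""

@dataclass
class Address:                      # composite attribute of Customer
    city: str
    state: str
    zip_code: str
    street: Street

@dataclass
class Customer:                     # the entity; customer_id is the primary key
    customer_id: int
    first_name: str
    last_name: str
    date_of_birth: date
    address: Address
    phone_numbers: List[str] = field(default_factory=list)  # multivalued attribute

    @property
    def age(self) -> int:           # derived attribute: computed, never stored
        today = date.today()
        return today.year - self.date_of_birth.year - (
            (today.month, today.day)
            < (self.date_of_birth.month, self.date_of_birth.day))
```

A derived attribute such as age is recomputed on demand, which is why an E-R diagram draws it with a dashed ellipse rather than storing it like a simple attribute.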
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file and direct file organizations, we have considered the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
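A minimal sketch of the idea, assuming an in-memory student file keyed by a hypothetical primary key roll_no, with a secondary index on stud_name mapping one key value to many records:

```python
from collections import defaultdict

# Primary file: records keyed by the primary key (roll_no).
students = {
    1: {"roll_no": 1, "stud_name": "Asha", "branch": "MCA"},
    2: {"roll_no": 2, "stud_name": "Ravi", "branch": "MCA"},
    3: {"roll_no": 3, "stud_name": "Asha", "branch": "MBA"},
}

# Secondary index: maps a secondary-key value to the SET of matching primary keys.
name_index = defaultdict(list)
for roll_no, rec in students.items():
    name_index[rec["stud_name"]].append(roll_no)

def find_by_name(name):
    """Secondary-key retrieval: may return many records for one key value."""
    return [students[r] for r in name_index.get(name, [])]
```

Unlike a primary-key lookup, find_by_name("Asha") returns two records, which is exactly the multiple-records-per-key behaviour described in point (ii).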
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation

QUE 3 - EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and every join dependency in it is a consequence of its candidate keys. Another way of expressing this is that there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependency: if a relation cannot be non-loss decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise);
For any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer   vendor          item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
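The decomposition can be checked mechanically. A small Python sketch (using the sample data above; the relation names come from the example) projects Buying onto the three binary tables and natural-joins them back, showing that the join is lossless and that "Claiborne sells jeans" now needs only one new Vendor-Item row:

```python
# Original 5NF-candidate relation as a set of (buyer, vendor, item) tuples.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# Project onto the three binary relations of the decomposition.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

def rejoin(bv, bi, vi):
    """Natural-join the three projections back into (buyer, vendor, item)."""
    return {(b, v, i)
            for b, v in bv
            for b2, i in bi if b == b2
            for v2, i2 in vi if v == v2 and i == i2}
```

Because the join dependency holds, rejoin() reproduces exactly the original five tuples, and recording that Liz Claiborne now sells jeans takes a single ("Liz Claiborne", "Jeans") tuple in Vendor-Item instead of one row per buyer.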
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product designed to support both batch and online application programs.
[Diagram: IMS architecture - Applications A and B, each written in a host language plus DL/I, access the database through the PCBs of their own PSB (PSB-A, PSB-B); the IMS control program maps these external views onto the physical databases defined by the DBDs.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD), and the mapping of the physical database to storage is also given in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
DBD   NAME=EDUCPDBD
SEGM  NAME=COURSE,BYTES=256
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
FIELD NAME=DESCRIPN,BYTES=220,START=37
SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
FIELD NAME=LOCATION,BYTES=12,START=7
FIELD NAME=FORMAT,BYTES=2,START=19
SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called a program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example:
PCB    TYPE=DB,DBDNAME=EDUCPDBD,KEYLEN=15
SENSEG NAME=COURSE,PROCOPT=G
SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on the segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are supported via user-written online application programs. IMS does not provide an integrated query language.
OR
(C) Explain the following:
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
Main characteristics of the functional dependencies used in normalization:
• they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency;
• they hold for all time;
• they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, with the property that every functional dependency in Y is implied by the functional dependencies in X.
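Whether a dependency holds in a given relation instance can be checked mechanically. The helper below is a hedged illustration (the function name and the sample attributes are invented for the example): an FD lhs → rhs holds when no two rows agree on lhs but differ on rhs.

```python
def holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in `rows`
    (each row is a dict). Every determinant value must map to one rhs value."""
    seen = {}
    for row in rows:
        det = tuple(row[a] for a in lhs)   # determinant value
        dep = tuple(row[a] for a in rhs)   # dependent value
        if seen.setdefault(det, dep) != dep:
            return False                   # same determinant, different dependent
    return True
```

Note that passing this check on one instance only shows the FD is not violated by that data; a real FD is a statement about all possible instances of the relation.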
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal-form condition that the relation meets, and indicates the degree to which it has been normalized. Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties. Industry pays particular attention to normalization up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF. The database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, each corresponding to a specific normal form with known properties. As normalization proceeds, the relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
(A determinant is an attribute on which some other attribute is fully functionally dependent.)
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and only if it is in BCNF and every non-trivial multivalued dependency in it is in fact a functional dependency; 4NF thus removes the unwanted structures caused by multivalued dependencies. For a relation to be in fourth normal form, one of the following must hold:
• there is no multivalued dependency in the relation, or
• there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it also considers multivalued dependencies.
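The effect of a multivalued dependency can be illustrated in a few lines of Python (the course/teacher/book relation is a standard textbook-style example, not taken from the question): when course ↠ teacher holds, splitting the table into (course, teacher) and (course, book) and rejoining loses nothing.

```python
# A relation violating 4NF: teachers and books for a course are independent,
# so every (teacher, book) combination must be stored.
course = {
    ("DB", "Smith", "Date"),
    ("DB", "Smith", "Ullman"),
    ("DB", "Jones", "Date"),
    ("DB", "Jones", "Ullman"),
}

# 4NF decomposition into two binary relations.
course_teacher = {(c, t) for c, t, b in course}
course_book    = {(c, b) for c, t, b in course}

# Natural join of the two projections on `course`.
rejoined = {(c, t, b)
            for c, t in course_teacher
            for c2, b in course_book if c == c2}
```

The rejoined relation equals the original, confirming the decomposition is lossless; adding a new book for the course now takes one row instead of one row per teacher.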
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases also offer some kind of query language, allowing objects to be found by a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is the way the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could retrieve a user's account information and provide it efficiently, with extensive detail such as transactions and account entries.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups take, and how great your risk of data loss will be when a breakdown occurs. System breakdowns happen all the time, even to the best-configured systems; this is why you have to explore the options available in order to prepare for the worst.
Database recovery is easier to achieve if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls:
• the speed and size of your transaction log backups;
• the degree to which you are at risk of losing committed transactions in the event of media failure.
Models
There are three database recovery models available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
• Database restoration up to any specified point in time can be achieved after media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
• The Log Marks feature allows you to place reference points in the transaction log and to recover to a log mark.
• CREATE INDEX operations are logged, so recovery from a transaction log backup that includes index creations is faster, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans:
Q5
EITHER
(a) What is deadlock? How can it be avoided? How can it be resolved once it occurs?
Ans: A deadlock occurs when two different users or transactions require access to data that is being locked by the other. It can be dealt with in two ways: by taking measures that prevent deadlocks from happening, and by providing ways to break a deadlock after it happens. One way to prevent or avoid deadlocks is to require the user to request all necessary locks at one time, ensuring they gain access to everything they need or nothing. Deadlocks can also sometimes be avoided by setting a resource access order, meaning resources must be locked in a certain order. Once a deadlock does occur, the DBMS must have a method for detecting it; to resolve it, the DBMS must select a victim transaction to cancel and revert that entire transaction, so that the resources required become available, allowing one transaction to complete while the other is reprocessed at a later time.

Explain the meaning of the expression "ACID transaction".
ACID stands for Atomicity, Consistency, Isolation, Durability. When any transaction happens it should be atomic, that is, it should either complete fully or not at all; there should not be anything like a semi-complete transaction. The database state should remain consistent after the completion of the transaction. If there is more than one transaction, the transactions should be scheduled in such a fashion that they remain in isolation from one another. Durability means that once a transaction commits, its effects will persist even if there are system failures.

What is the purpose of transaction isolation levels?
Transaction isolation levels affect how the database operates while transactions are in the process of being changed. Their purpose is to ensure consistency throughout the database. For example, if I am changing a row which affects the calculations or outputs of several other rows, then all rows that are affected, or possibly affected, by the change will be locked from changes until my change is complete. This isolates the change and ensures that the data interaction remains accurate and consistent; this is known as transaction-level consistency. The transaction being changed may also affect how other rows are read. Say I am processing a change to the tax rate in my state: my store clerk should not be able to read the total cost of a blue shirt, because the total-cost row is affected by any change in the tax-rate row. Essentially, how you deal with the reading and viewing of data while a change is being processed but has not yet been committed is the transaction isolation level. Its purpose is to ensure that no one is misinformed prior to a transaction being committed.
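The deadlock-detection step described earlier is usually implemented as a cycle search over a wait-for graph, with an edge from each transaction to the transactions whose locks it is waiting on. A minimal sketch (the function and the graph encoding are illustrative, not a specific DBMS API):

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph {txn: set of txns it waits on}.
    A cycle in this graph means the transactions on it are deadlocked."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on current path / done
    colour = {t: WHITE for t in wait_for}

    def dfs(t):
        colour[t] = GREY
        for u in wait_for.get(t, ()):
            if colour.get(u, WHITE) == GREY:
                return True               # back edge => cycle => deadlock
            if colour.get(u, WHITE) == WHITE and dfs(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour[t] == WHITE and dfs(t) for t in wait_for)
```

When the check returns True, the DBMS picks a victim on the cycle and rolls it back, which is exactly the resolution strategy described above.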
(b) Explain concurrency control and database recovery in detail
Ans: In a multiprogramming environment, where multiple transactions can be executed simultaneously, it is highly important to control the concurrency of transactions. We have concurrency control protocols to ensure the atomicity, isolation and serializability of concurrent transactions. Concurrency control protocols can be broadly divided into two categories:
• Lock-based protocols
• Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two kinds:
• Binary locks: a lock on a data item can be in two states; it is either locked or unlocked.
• Shared/exclusive locks: this type of locking mechanism differentiates the locks based on their use. If a lock is acquired on a data item in order to perform a write operation, it is an exclusive lock, since allowing more than one transaction to write the same data item would lead the database into an inconsistent state. Read locks are shared, because no data value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a write operation is performed. Transactions may unlock the data item after completing the "write" operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they
need locks Before initiating an execution the transaction requests the system for all the locks it
needs beforehand If all the locks are granted the transaction executes and releases all the locks
when all its operations are over If all the locks are not granted the transaction rolls back and
waits until all the locks are granted
Two-Phase Locking (2PL)
This locking protocol divides the execution phase of a transaction into three parts. In the first part, when the transaction starts executing, it seeks permission for the locks it requires. The second part is where the transaction acquires all the locks. As soon as the transaction releases its first lock, the third phase starts; in this phase, the transaction cannot demand any new locks, it only releases the acquired locks.
Two-phase locking has two phases: one is growing, where all the locks are being acquired by the transaction; the second phase is shrinking, where the locks held by the transaction are being released. To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as that of 2PL. After acquiring all the locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock after using it: it holds all the locks until the commit point and releases them all at one time. Strict-2PL therefore does not suffer from cascading aborts, as 2PL can.
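The two-phase rule itself is easy to state in code. Below is an illustrative sketch (not a real lock manager: there is no waiting, conflict handling or shared/exclusive distinction) that simply enforces "no acquire after the first release":

```python
class TwoPhaseTxn:
    """Enforces the two-phase rule: once any lock is released (shrinking
    phase), the transaction may not acquire further locks."""

    def __init__(self):
        self.locks = set()
        self.shrinking = False      # flips to True at the first release

    def acquire(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after a release")
        self.locks.add(item)        # growing phase

    def release(self, item):
        self.shrinking = True       # shrinking phase begins
        self.locks.discard(item)
```

Strict-2PL, by contrast, would never call release() until commit; modelling it here would simply mean releasing every lock in one step at commit time.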
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol uses either system time or a logical counter as a timestamp. Lock-based protocols manage the order between conflicting pairs of transactions at the time of execution, whereas timestamp-based protocols start working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age of the transaction. A transaction created at 0002 clock time would be older than all transactions that come after it; for example, a transaction y entering the system at 0004 is two seconds younger, and priority is given to the older one.
In addition, every data item carries the latest read- and write-timestamps, which let the system know when the last "read" and "write" operations were performed on the data item.
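The read/write rules of basic timestamp ordering can be sketched as follows (a simplified illustration; a real system would also roll back and restart the aborted transaction with a new timestamp):

```python
class Item:
    """A data item carrying the latest read- and write-timestamps."""
    def __init__(self):
        self.read_ts = 0    # timestamp of the youngest transaction that read it
        self.write_ts = 0   # timestamp of the youngest transaction that wrote it

def read(item, ts):
    """A reader older than the last writer must abort (it would see the future)."""
    if ts < item.write_ts:
        return False                        # abort: overwritten by a younger txn
    item.read_ts = max(item.read_ts, ts)
    return True

def write(item, ts):
    """A writer older than the last reader or writer must abort."""
    if ts < item.read_ts or ts < item.write_ts:
        return False                        # abort: would invalidate a younger txn
    item.write_ts = ts
    return True
```

The checks encode exactly the age-based priority described above: an operation that would run "behind" a younger transaction's view of the item is rejected.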
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases. This includes:
Data stored in database
Database server
Database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented and maintained by a database administrator and/or other information security professional.
Some of the ways database security is analyzed and implemented include:
• Restricting unauthorized access and use by implementing strong and multifactor access and data management controls
• Load/stress testing and capacity testing of the database to ensure it does not crash under a distributed denial of service (DDoS) attack or user overload
• Physical security of the database server and backup equipment against theft and natural disasters
• Reviewing the existing system for any known or unknown vulnerabilities, and defining and implementing a road map/plan to mitigate them
(d) Explain knowledge-based database systems in detail.
Ans:
The term knowledge-base was coined to distinguish this form of knowledge store from the more common and widely used term database. At the time (the 1970s), virtually all large Management Information Systems stored their data in some type of hierarchical or relational database. At this point in the history of information technology, the distinction between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
• Flat data: data was usually represented in a tabular format, with strings or numbers in each field.
• Multiple users: a conventional database needed to support more than one user or system logged into the same data at the same time.
• Transactions: an essential requirement for a database was to maintain integrity and consistency among data accessed by concurrent users. These are the so-called ACID properties: Atomicity, Consistency, Isolation and Durability.
• Large, long-lived data: a corporate database needed to support not just thousands but hundreds of thousands or more rows of data. Such a database usually needed to persist past the specific uses of any individual program; it needed to store data for years and decades rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database requirements. An expert system requires structured data: not just tables with numbers and strings, but pointers to other objects that in turn have additional pointers. The ideal representation for a knowledge base is an object model (often called an ontology in the artificial intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users, or for the complexity that comes with requiring transactional properties on data. The data for the early expert systems was used to arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response to an emergency.[1] Once the solution to the problem was known, there was not a critical demand to store large amounts of data back to a permanent memory store. A more precise statement would be that, given the technologies available, researchers compromised and did without these capabilities because they realized they were beyond what could be expected, and they could develop useful solutions to non-trivial problems without them. Even from the beginning, the more astute researchers realized the potential benefits of being able to store, analyze and reuse knowledge. For example, see the discussion of Corporate Memory in the earliest work on the Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database. The knowledge-base needed to know facts about the world, for example to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store information about thousands of specific humans.
Representing that all humans are mortal, and being able to reason about any given human that
they are mortal, is the work of a knowledge-base. Representing that George, Mary, Sam, Jenna,
Mike, and hundreds of thousands of other customers are all humans with specific ages, sex,
address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments,
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple, distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and object-oriented
communities, object-oriented databases such as Versant emerged. These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet. With the rise of the Internet,
documents, hypertext, and multimedia support were now critical for any corporate database. It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents. This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge management
vendors such as Lotus Notes. Knowledge management actually predated the Internet, but with
the Internet there was great synergy between the two areas. Knowledge management products
adopted the term knowledge-base to describe their repositories, but the meaning had a subtle
difference. In the case of previous knowledge-based systems, the knowledge was primarily for
the use of an automated system, to reason about and draw conclusions about the world. With
knowledge management products, the knowledge was primarily meant for humans, for example
to serve as a repository of manuals, procedures, policies, best practices, reusable designs and
code, etc. In both cases, the distinctions between the uses and kinds of systems were ill-defined.
As the technology scaled up, it was rare to find a system that could really be cleanly classified as
knowledge-based in the sense of an expert system that performed automated reasoning, or
knowledge-based in the sense of knowledge management that provided knowledge in the form
of documents and media that could be leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018–2019
Subject: DBMS
MCA 1st Year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations
without them conflicting with each other. Concurrent access is quite easy if all users are just
reading data; there is no way they can interfere with one another. However, any practical
database will have a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user
system. It helps you make sure that database transactions are performed concurrently without
violating the data integrity of the respective databases.
Therefore, concurrency control is an essential element for the proper functioning of a system
where two or more database transactions that require access to the same data are executed
simultaneously.
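The conflict described above can be sketched in a few lines of Python. This is a minimal illustration, not a DBMS: a lock plays the role of a write lock, making each read-modify-write on a shared "balance" atomic so that four concurrent writers cannot interleave incorrectly. All names here are invented for the example.

```python
import threading

balance = 0
lock = threading.Lock()

def deposit(times):
    """Each call performs `times` read-modify-write operations on the shared balance."""
    global balance
    for _ in range(times):
        with lock:        # plays the role of a DBMS write lock
            balance += 1  # read-modify-write, now atomic per iteration

threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)  # with the lock, always 40000
```

Without the lock, the interleaved read-modify-write sequences could lose updates, which is exactly the kind of conflict a DBMS concurrency-control mechanism prevents.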
(ii) Atomicity property
Ans: In database systems, atomicity (from the Greek átomos, "undividable") is one of the
ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring
only partially, which can cause greater problems than rejecting the whole series outright. As a
consequence, the transaction cannot be observed to be in progress by another database client:
at one moment in time it has not yet happened, and at the next it has already occurred in whole
(or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a
consistent state, that is, money is neither lost nor created if either of those two operations fails.
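The bank-transfer example can be sketched with Python's built-in sqlite3 module, whose connection object commits a transaction on success and rolls it back on an exception. The table and account names are invented for illustration; this is a sketch of the atomicity idea, not a production banking design.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
con.execute("INSERT INTO account VALUES ('A', 100), ('B', 0)")
con.commit()

def transfer(con, src, dst, amount):
    try:
        with con:  # transaction: commits on success, rolls back on exception
            con.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                        (amount, src))
            cur = con.execute("SELECT balance FROM account WHERE name = ?", (src,))
            if cur.fetchone()[0] < 0:
                raise ValueError("insufficient funds")  # forces a rollback
            con.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                        (amount, dst))
    except ValueError:
        pass  # the whole transfer is undone; neither UPDATE persists

transfer(con, 'A', 'B', 60)    # succeeds
transfer(con, 'A', 'B', 100)   # fails and rolls back: balances unchanged
print(dict(con.execute("SELECT name, balance FROM account")))
```

The second transfer attempts the withdrawal, detects the negative balance, and the rollback undoes it, so money is neither lost nor created.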
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
These three levels are explained in detail below:
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database; this view describes only a part of
the actual database. Each user is not concerned with the entire database; only the part that is
relevant to the user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares database objects, while the data
manipulation language performs operations on these objects. The data control language is
used to control the user's access to database objects.
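The three sublanguages can be illustrated with a short sketch using Python's sqlite3 module. The table and column names are invented for the example; note that SQLite itself has no DCL statements such as GRANT/REVOKE, so the DCL part is shown only as a comment.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# DDL: define and declare a database object
con.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object
con.execute("INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi')")
con.execute("UPDATE student SET name = 'Ravi K' WHERE student_id = 2")
rows = con.execute(
    "SELECT student_id, name FROM student ORDER BY student_id"
).fetchall()
print(rows)  # [(1, 'Asha'), (2, 'Ravi K')]

# DCL (not supported by SQLite; shown for completeness):
#   GRANT SELECT ON student TO some_user;
```

In a full client-server DBMS, the GRANT statement would control which users may run the SELECT above.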
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for DBMS are explained above.
(C) Describe the structure of DBMS.
Ans: A DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update, and
retrieval) on the database. The components of the DBMS perform these requested operations on
the database and provide the necessary data to the users.
Fig.: Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - The Data Description Language compiler processes schema definitions
specified in the DDL. It stores metadata such as the names of the files and data items, storage
details of each file, mapping information, and constraints.
2 DML Compiler and Query Optimizer - DML commands such as insert, update, delete, and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized by the query optimizer to find the
best way to execute the query, and then sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also
known as the Database Control System.
The main functions of the Data Manager are:
Converting operations in users' queries, coming from the application programs or from the
combination of DML compiler and query optimizer (known as the Query Processor), from the
user's logical view to the physical file system.
Controlling access to DBMS information that is stored on disk.
Handling buffers in main memory.
Enforcing constraints to maintain the consistency and integrity of the data.
Synchronizing the simultaneous operations performed by concurrent users.
Controlling the backup and recovery operations.
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the
database. It contains information about:
1 Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2 Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3 Constraints on data, i.e. the range of values permitted.
4 Detailed information on physical database design, such as storage structures, access paths, and file and record sizes.
5 Access authorization - a description of database users, their responsibilities, and their access rights.
6 Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to control data integrity, database operation, and accuracy, and may be considered an important part of the DBMS.
Importance of Data Dictionary - The data dictionary is necessary in databases due to the following reasons:
• It improves the control of the DBA over the information system and the users' understanding of the use of the system.
• It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7 End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1 Naïve users
2 Online users
3 Application programmers
4 Database administrator
i) Naïve users: Naïve users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naïve users are those for whom the type and range of response is always indicated. Thus, a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naïve users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal, or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying, and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and the access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over a conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database becomes
a common resource for an agency. It implies separation of physical storage from the use of the
data by an application program, i.e. program/data independence: the user, programmer, or
application specialist need not know the details of how the data are stored; such details are
transparent to the user. Changes can be made to data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from
one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system, every user group
maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files.
• Wastage of storage space, since duplicated data is stored.
• Errors generated due to updating of the same data in different files.
• Time wasted in entering data again and again.
• Computer resources being needlessly used.
• Difficulty in combining information.
2 Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to other files.
This may lead to inconsistent data. We therefore need to remove this duplication of data
across multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In a
conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
timeliness are likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, use of a DBMS
should allow users that don't know programming to interact with the data more easily, unlike a
file processing system where the programmer may need to write new programs to meet every
new demand.
4 Flexibility of the System is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the data in
the database changes.
5 Integrity can be improved - Since data in the database approach is centralized and used by a
number of users at a time, it is essential to enforce integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or changes
may sometimes lead to the entry of incorrect data in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the
purpose of data interchange or migration between systems.
7 Security can be improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization would access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database, and different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8 Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work as the most important and
therefore its needs as the most important. Once a database has been set up with centralized
control, it will be necessary to identify the organization's requirements and to balance the needs
of the competing units. It may become necessary to ignore some requests for information if
they conflict with higher-priority needs of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large, one
normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher when using the non-procedural languages
developed with DBMSs than when using procedural languages.
10 Data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand; the overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes for recovery and
backup from failures, including disk crashes, power failures, and software errors, which may
help the database recover from an inconsistent state to the state that existed prior to the
occurrence of the failure, though the methods are very complex.
QUE 2- EITHER
(A) Explain the ER model with a suitable example.
Ans: It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process, with all business managers (or designates) involved, and
should be validated with a "bottom-up" approach. It has three primary components: entity,
relationship, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships, and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type. There are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID,
student name, address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
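One plausible way to map the Customer example above into relational tables, sketched with sqlite3: the composite attributes (name, address, street) are flattened into individual columns, and the multivalued phone_number attribute becomes a separate table keyed by the owning entity. The column names follow the example; the sample data is invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# composite attributes flattened into columns
con.execute("""
CREATE TABLE customer (
    customer_id      INTEGER PRIMARY KEY,
    first_name       TEXT, middle_name TEXT, last_name TEXT,
    date_of_birth    TEXT,
    city TEXT, state TEXT, zip_code TEXT,
    street_name TEXT, street_number TEXT, apartment_number TEXT
)""")
# multivalued attribute -> its own table referencing the entity's key
con.execute("""
CREATE TABLE customer_phone (
    customer_id  INTEGER REFERENCES customer(customer_id),
    phone_number TEXT,
    PRIMARY KEY (customer_id, phone_number)
)""")

con.execute("INSERT INTO customer (customer_id, first_name, last_name) "
            "VALUES (1, 'John', 'Doe')")
con.execute("INSERT INTO customer_phone VALUES (1, '555-0101'), (1, '555-0102')")
n = con.execute("SELECT COUNT(*) FROM customer_phone "
                "WHERE customer_id = 1").fetchone()[0]
print(n)  # 2: one customer, two phone numbers
```

A derived attribute such as age would not be stored at all; it would be computed from date_of_birth at query time.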
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file, and direct file organizations we have
considered the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the
set of records which satisfy the given value.
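A minimal sketch of a secondary index in Python: the primary index maps the unique stud_id to a record, while the secondary index maps the non-unique stud_name to the list of matching record ids. The records and names are invented for illustration.

```python
# primary index: stud_id -> record
students = {
    101: {"stud_name": "Amit",  "city": "Nagpur"},
    102: {"stud_name": "Priya", "city": "Pune"},
    103: {"stud_name": "Amit",  "city": "Mumbai"},
}

# build the secondary index: stud_name -> [stud_ids]
secondary = {}
for sid, rec in students.items():
    secondary.setdefault(rec["stud_name"], []).append(sid)

# retrieval on the secondary key returns a SET of records, not just one
matches = [students[sid] for sid in secondary.get("Amit", [])]
print(matches)  # two records satisfy stud_name = 'Amit'
```

This is exactly point (ii) above: unlike a primary-key lookup, the secondary-key lookup for "Amit" yields multiple records.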
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3- EITHER
(A) Let R(A,B,C) and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE 4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows: if a table can be decomposed into three or more
smaller tables, it must be capable of being joined again on common keys to form the original
table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it
cannot have a lossless decomposition into any number of smaller tables. Another way of
expressing this is that each join dependency is a consequence of the candidate keys. It can also
be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three
or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependency: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
you always need to know two values (pairwise);
for any one, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you
create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order
to determine the item you must know the buyer and vendor, to determine the vendor you must
know the buyer and the item, and finally, to know the buyer you must know the vendor and the
item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item, and
Vendor-Item.
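The decomposition described above can be checked in a few lines of Python: project the Buying sample data onto the three pairwise tables, then rejoin them on the common attributes and verify that exactly the original rows come back (i.e. the join dependency holds for this data, so the three-way split is lossless).

```python
# the Buying(buyer, vendor, item) sample data from the text
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# the three projections: Buyer-Vendor, Buyer-Item, Vendor-Item
buyer_vendor = {(b, v) for b, v, _ in buying}
buyer_item   = {(b, i) for b, _, i in buying}
vendor_item  = {(v, i) for _, v, i in buying}

# natural join of the three projections on their common attributes
rejoined = {
    (b, v, i)
    for (b, v) in buyer_vendor
    for (b2, i) in buyer_item if b2 == b
    for (v2, i2) in vendor_item if v2 == v and i2 == i
}
print(rejoined == buying)  # True: the decomposition is lossless
```

When Claiborne starts selling jeans, only one row is added to Vendor-Item, instead of one Buying row per jeans-buying customer.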
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to
support both batch and online application programs.
[Fig.: IMS architecture - Applications A and B, each written in a host language plus DL/I, access
the IMS control program through their PSBs (PSB-A, PSB-B), each consisting of PCBs; the
control program maps these onto the physical databases defined by the DBDs.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the
associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1 DBD NAME=EDUCPDBD
2 SEGM NAME=COURSE,BYTES=256
3 FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4 FIELD NAME=TITLE,BYTES=33,START=4
5 FIELD NAME=DESCRIPN,BYTES=220,START=37
6 SEGM NAME=PREREQ,PARENT=COURSE,BYTES=36
7 FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8 FIELD NAME=TITLE,BYTES=33,START=4
9 SEGM NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB TYPE=DB,DBDNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following:
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of
another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every
candidate key, and no attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of functional dependencies used in normalization:
they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the
dependency, hold for all time, and are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is
important to find an approach that can reduce the set to a manageable size. We need to identify
a set of functional dependencies (X) for a relation that is smaller than the complete set of
functional dependencies (Y) for that relation and has the property that every functional
dependency in Y is implied by the functional dependencies in X.
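The definition above can be made concrete with a small sketch: a functional dependency X -> Y holds in a relation instance if each value of the determinant X maps to exactly one value of Y. The helper name and the attribute values are invented for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in `rows`
    (a list of dicts). Returns False on the first violating pair."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # same determinant value, different dependent value
    return True

emp = [
    {"emp_id": 1, "dept": "Sales", "dept_city": "Nagpur"},
    {"emp_id": 2, "dept": "Sales", "dept_city": "Nagpur"},
    {"emp_id": 3, "dept": "HR",    "dept_city": "Pune"},
]

print(fd_holds(emp, ["dept"], ["dept_city"]))   # True: dept determines dept_city
print(fd_holds(emp, ["dept_city"], ["emp_id"])) # False: not a determinant
```

Note that such a check only confirms a dependency for one instance; a real functional dependency must hold for all time, as stated above.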
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal
form condition that it meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet
the desirable properties.
Normalization in industry pays particular attention to normal forms up to 3NF, BCNF, or 4NF;
we will pay particular attention to forms up to 3NF. Database designers need not normalize to
the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the
functional dependencies between its attributes. It is often executed as a series of steps, where
each step corresponds to a specific normal form with known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format
and also less vulnerable to update anomalies.
NF²: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every non-key attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every non-key attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and only if it is in BCNF and its multivalued dependencies are functional dependencies. 4NF removes unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
• There is no multivalued dependency in the relation, or
• There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers multivalued dependencies.
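A small worked example may help. The relation below (course, teacher, book) is invented for illustration; it has the multivalued dependency course ↠ teacher (and, independently, course ↠ book), so it is not in 4NF. Decomposing it into its two projections removes the redundancy, and rejoining them recovers the original relation, confirming the decomposition is lossless:

```python
# Hypothetical relation with the multivalued dependency course ->-> teacher
# (and independently course ->-> book): every teacher of a course is
# paired with every book used in that course.
ctb = {("DB", "Rao", "Elmasri"), ("DB", "Rao", "Date"),
       ("DB", "Shah", "Elmasri"), ("DB", "Shah", "Date")}

# 4NF decomposition: project onto (course, teacher) and (course, book).
ct = {(c, t) for c, t, b in ctb}
cb = {(c, b) for c, t, b in ctb}

# Lossless-join check: joining the projections gives back the original.
rejoined = {(c, t, b) for (c, t) in ct for (c2, b) in cb if c == c2}
print(rejoined == ctb)  # True: the decomposition is lossless
```

Note how the original relation needs four tuples to record two teachers and two books, while the 4NF projections need only two tuples each.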
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found by a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language (OQL).
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
Many object databases (for example, VOSS) offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could retrieve a user's account information and efficiently provide extensive information such as transactions, account entries, etc.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery is easier to achieve if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls the following:
• The speed and size of your transaction log backups
• The degree to which you might be at risk of losing committed transactions in the event of media failure
Models
There are three types of database recovery models available:
• Full Recovery
• Bulk-Logged Recovery
• Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
• Database restoration up to any specified time can be achieved after a media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
• The Log Marks feature allows you to place reference points in the transaction log so that you can recover to a log mark.
• CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index creations proceeds faster because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans
(b) Explain concurrency control and database recovery in detail.
Ans: In a multiprogramming environment, where multiple transactions can be executed simultaneously, it is highly important to control the concurrency of transactions. We have concurrency control protocols to ensure atomicity, isolation, and serializability of concurrent transactions. Concurrency control protocols can be broadly divided into two categories:
• Lock-based protocols
• Timestamp-based protocols
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which a transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two kinds:
• Binary locks: a lock on a data item can be in two states; it is either locked or unlocked.
• Shared/exclusive locks: this type of locking mechanism differentiates locks based on their use. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock, since allowing more than one transaction to write to the same data item would lead the database into an inconsistent state. Read locks are shared because no data value is being changed.
There are four types of lock protocols available:
Simplistic Lock Protocol
Simplistic lock-based protocols require transactions to obtain a lock on every object before a write operation is performed. Transactions may unlock the data item after completing the 'write' operation.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they need locks. Before initiating an execution, the transaction requests the system for all the locks it needs beforehand. If all the locks are granted, the transaction executes and releases all the locks when all its operations are over. If the locks are not all granted, the transaction rolls back and waits until they are.
Two-Phase Locking (2PL)
This locking protocol divides the execution of a transaction into three parts. In the first part, when the transaction starts executing, it seeks permission for the locks it requires. The second part is where the transaction acquires all the locks. As soon as the transaction releases its first lock, the third phase starts; in this phase the transaction cannot demand any new locks, it only releases the acquired locks.
Two-phase locking thus has two phases: a growing phase, where all the locks are being acquired by the transaction, and a shrinking phase, where the locks held by the transaction are being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then upgrade it to an exclusive lock.
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock immediately after using it: it holds all the locks until the commit point and releases them all at once. Strict-2PL therefore does not have cascading aborts, as 2PL does.
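The shared/exclusive compatibility rules described above can be sketched as a minimal lock table. This is an illustration only, not a full DBMS lock manager: there is no wait queue and no deadlock detection, and the class and transaction names are invented.

```python
# Minimal shared/exclusive lock table sketch.
class LockTable:
    def __init__(self):
        self.locks = {}            # item -> (mode, set of holding txns)

    def acquire(self, txn, item, mode):        # mode is "S" or "X"
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})   # item was free
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)                   # shared locks are compatible
            return True
        if holders == {txn}:                   # lone holder may upgrade S -> X
            self.locks[item] = ("X" if mode == "X" else held_mode, holders)
            return True
        return False                           # conflict: caller must wait or roll back

    def release(self, txn, item):
        mode, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]

lt = LockTable()
print(lt.acquire("T1", "A", "S"))  # True
print(lt.acquire("T2", "A", "S"))  # True: two readers share item A
print(lt.acquire("T2", "A", "X"))  # False: T1 still holds a shared lock on A
```

Under 2PL a transaction would call `acquire` only during its growing phase and `release` only during its shrinking phase; under Strict-2PL all the `release` calls happen together at commit.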
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol uses either system time or a logical counter as a timestamp.
Lock-based protocols manage the order between conflicting pairs of transactions at execution time, whereas timestamp-based protocols start working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age of the transaction. A transaction created at clock time 00:02 would be older than all transactions that come after it; for example, any transaction y entering the system at 00:04 is two seconds younger, and priority is given to the older one.
In addition, every data item is given the latest read- and write-timestamps. This lets the system know when the last read and write operations were performed on the data item.
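The basic timestamp-ordering rules implied by this description can be sketched as follows. This is a simplified illustration under the assumption that larger timestamps mean younger transactions; the function names and the "rollback" signalling are invented for the example.

```python
# Basic timestamp-ordering sketch: each item tracks the largest read and
# write timestamps seen so far, and an operation that arrives "too late"
# forces its transaction to roll back.
read_ts, write_ts = {}, {}        # item -> latest timestamp (0 if never touched)

def read(txn_ts, item):
    if txn_ts < write_ts.get(item, 0):
        return "rollback"          # a younger txn already wrote this item
    read_ts[item] = max(read_ts.get(item, 0), txn_ts)
    return "ok"

def write(txn_ts, item):
    if txn_ts < read_ts.get(item, 0) or txn_ts < write_ts.get(item, 0):
        return "rollback"          # a younger txn already read or wrote it
    write_ts[item] = txn_ts
    return "ok"

print(write(5, "A"))   # ok: first write to A
print(read(3, "A"))    # rollback: txn 3 is older than the writer (ts 5)
print(read(7, "A"))    # ok
print(write(6, "A"))   # rollback: txn 7 has already read A
```

The contrast with locking is visible here: no transaction ever waits; a conflicting operation simply aborts its transaction, which restarts with a new timestamp.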
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases. This includes:
• Data stored in the database
• The database server
• The database management system (DBMS)
• Other database workflow applications
Database security is generally planned, implemented, and maintained by a database administrator and/or other information security professionals.
Some of the ways database security is analyzed and implemented include:
• Restricting unauthorized access and use by implementing strong and multifactor access and data management controls
• Load/stress testing and capacity testing of a database to ensure it does not crash under a distributed denial of service (DDoS) attack or user overload
• Physical security of the database server and backup equipment against theft and natural disasters
• Reviewing the existing system for any known or unknown vulnerabilities, and defining and implementing a road map/plan to mitigate them
(d) Explain knowledge-based database systems in detail.
Ans
The term knowledge base was coined to distinguish this form of knowledge store from the more common and widely used term database. At the time (the 1970s), virtually all large Management Information Systems stored their data in some type of hierarchical or relational database. At this point in the history of Information Technology, the distinction between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
• Flat data: data was usually represented in a tabular format with strings or numbers in each field.
• Multiple users: a conventional database needed to support more than one user or system logged into the same data at the same time.
• Transactions: an essential requirement for a database was to maintain integrity and consistency among data accessed by concurrent users. These are the so-called ACID properties: Atomicity, Consistency, Isolation, and Durability.
• Large, long-lived data: a corporate database needed to support not just thousands but hundreds of thousands or more rows of data. Such a database usually needed to persist past the specific uses of any individual program; it needed to store data for years and decades rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database requirements. An expert system requires structured data: not just tables with numbers and strings, but pointers to other objects that in turn have additional pointers. The ideal representation for a knowledge base is an object model (often called an ontology in the artificial intelligence literature) with classes, subclasses, and instances.
Early expert systems also had little need for multiple users or the complexity that comes with requiring transactional properties on data. The data for the early expert systems was used to arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response to an emergency.[1] Once the solution to the problem was known, there was not a critical demand to store large amounts of data back to a permanent memory store. A more precise statement would be that, given the technologies available, researchers compromised and did without these capabilities because they realized they were beyond what could be expected, and they could develop useful solutions to non-trivial problems without them. Even from the beginning, the more astute researchers realized the potential benefits of being able to store, analyze, and reuse knowledge. For example, see the discussion of Corporate Memory in the earliest work of the Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge base compared to a conventional database. The knowledge base needed to know facts about the world, for example to represent the statement that "All humans are mortal". A database typically could not represent this general knowledge, but would instead need to store information about thousands of rows that represented information about specific humans. Representing that all humans are mortal, and being able to reason about any given human that they are mortal, is the work of a knowledge base. Representing that George, Mary, Sam, Jenna, Mike, and hundreds of thousands of other customers are all humans with specific ages, sex, address, etc., is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments, the requirements for their data storage rapidly started to overlap with the standard database requirements for multiple, distributed users with support for transactions. Initially, the demand could be seen in two different but competitive markets. From the AI and object-oriented communities, object-oriented databases such as Versant emerged; these were systems designed from the ground up to support object-oriented capabilities but also to support standard database services. On the other hand, the large database vendors, such as Oracle, added capabilities to their products that provided support for knowledge-base requirements such as class–subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge base was the Internet. With the rise of the Internet, documents, hypertext, and multimedia support became critical for any corporate database. It was no longer enough to support large tables of data or relatively small objects that lived primarily in computer memory. Support for corporate web sites required persistence and transactions for documents. This created a whole new discipline known as Web Content Management. The other driver for document support was the rise of knowledge management vendors, such as Lotus Notes. Knowledge management actually predated the Internet, but with the Internet there was great synergy between the two areas. Knowledge management products adopted the term knowledge base to describe their repositories, but the meaning had a subtle difference. In the case of previous knowledge-based systems, the knowledge was primarily for the use of an automated system, to reason about and draw conclusions about the world. With knowledge management products, the knowledge was primarily meant for humans, for example to serve as a repository of manuals, procedures, policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a system that could really be cleanly classified as knowledge-based in the sense of an expert system that performed automated reasoning, or knowledge-based in the sense of knowledge management that provided knowledge in the form of documents and media that could be leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018–2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another. Concurrent access is quite easy if all users are just reading data, as there is no way they can interfere with one another. However, any practical database will have a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user system. It helps you to make sure that database transactions are performed concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a system where two or more database transactions that require access to the same data are executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, "undividable") is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur, or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client: at one moment in time it has not yet happened, and at the next it has already occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B. Performing these operations in an atomic transaction ensures that the database remains in a consistent state, that is, money is neither lost nor created if either of those two operations fails.
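The transfer example can be sketched in a few lines. This is an illustration of the all-or-nothing idea only, not how a real DBMS implements atomicity (which uses logging rather than a full snapshot); the function and account names are invented.

```python
# All-or-nothing transfer sketch: either both the debit and the credit
# apply, or the balances are restored to their pre-transaction values.
def transfer(accounts, src, dst, amount):
    snapshot = dict(accounts)          # cheap "undo log" for this sketch
    try:
        if accounts[src] < amount:
            raise ValueError("insufficient funds")
        accounts[src] -= amount
        accounts[dst] += amount
    except Exception:
        accounts.clear()
        accounts.update(snapshot)      # roll back: nothing happened
        return False
    return True                        # commit: everything happened

accounts = {"A": 100, "B": 50}
print(transfer(accounts, "A", "B", 30), accounts)   # True {'A': 70, 'B': 80}
print(transfer(accounts, "A", "B", 999), accounts)  # False, balances unchanged
```

Note that the total of the two balances is the same before and after every call, whether the transfer committed or rolled back, which is exactly the consistency that atomicity protects.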
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
• All users should be able to access the same data.
• A user's view is immune to changes made in other views.
• Users should not need to know physical database storage details.
• The DBA should be able to change database storage structures without affecting the users' views.
• The internal structure of the database should be unaffected by changes to physical aspects of storage.
• The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one that is closest to the user. It is also called the user view. The user view is different from the way data is stored in the database; this view describes only a part of the actual database. Because each user is not concerned with the entire database, only the part that is relevant to the user is visible. For example, end users and application programmers get different external views.
Each user uses a language to carry out database operations. The application programmer uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a combination of three subordinate languages:
• Data Definition Language (DDL)
• Data Manipulation Language (DML)
• Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data manipulation language performs operations on these objects. The data control language is used to control the user's access to database objects.
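The division of labour between DDL and DML can be seen in one short session. SQLite (via Python's standard `sqlite3` module) is used here purely for illustration; it has no DCL such as GRANT/REVOKE, which would appear in a multi-user DBMS, and the table and names are invented.

```python
import sqlite3

# DDL defines the object; DML populates and queries it.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")  # DDL
con.execute("INSERT INTO student VALUES (1, 'Asha')")                      # DML
con.execute("INSERT INTO student VALUES (2, 'Ravi')")                      # DML
rows = con.execute("SELECT name FROM student ORDER BY roll").fetchall()    # DML
print(rows)  # [('Asha',), ('Ravi',)]
con.close()
```

In a server DBMS, a DCL statement such as `GRANT SELECT ON student TO some_user` would complete the trio by controlling who may run the DML above.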
Conceptual Level - This level comes between the external and the internal levels. The conceptual level represents the entire database as a whole and is used by the DBA. This level is the view of the data "as it really is". The user's view of the data is constrained by the language they are using; at the conceptual level, the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for DBMS are explained above.
(C) Describe the structure of DBMS
Ans: The DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations (insert, delete, update, and retrieval) on the database. The components of the DBMS perform these requested operations on the database and provide the necessary data to the users.
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified in the DDL. It includes metadata information such as the names of files and data items, storage details of each file, mapping information, and constraints.
2. DML Compiler and Query Optimizer - DML commands such as insert, update, delete, and retrieve from the application program are sent to the DML compiler for compilation into object code for database access. The object code is then optimized by the query optimizer to find the best way to execute the query, and then sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also known as the Database Control System.
The main functions of the Data Manager are:
• Converting operations in user queries, coming from the application programs or from the combination of DML compiler and query optimizer (known as the Query Processor), from the user's logical view to the physical file system.
• Controlling DBMS information access that is stored on disk.
• Handling buffers in main memory.
• Enforcing constraints to maintain the consistency and integrity of the data.
• Synchronizing the simultaneous operations performed by concurrent users.
• Controlling backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It contains information about:
1. Data: names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e., the range of values permitted.
4. Detailed information on physical database design, such as storage structures, access paths, and file and record sizes.
5. Access authorization: a description of database users, their responsibilities, and their access rights.
6. Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control the data integrity, database operation, and accuracy, and may be used as an important part of the DBMS.
Importance of the Data Dictionary - The data dictionary is necessary in databases for the following reasons:
• It improves the DBA's control of the information system and the users' understanding of the use of the system.
• It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts high-level queries into low-level file access commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups, depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve users: Naïve users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the automatic teller machine, only one or more of the user's own accounts. For other such naïve users, the type and range of response is always indicated to the user. Thus, a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naïve users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general-purpose programming language, such as Assembler, C, COBOL, FORTRAN, PASCAL, or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying, and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view, or conceptual level, of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It implies separation of physical storage from the use of the data by an application program, i.e., program/data independence: the user, programmer, or application specialist need not know the details of how the data are stored, as such details are transparent to the user. Changes can be made to the data without affecting other components of the system, e.g., changing the format of data items (real to integer arithmetic operations), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g., from optical to magnetic storage, or from tape to disk).
Advantages
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updating the same data in different files
• Time wasted in entering the same data again and again
• Computer resources being needlessly used
• Difficulty in combining information
2. Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data. We therefore need to remove this duplication of data across multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness is likely to improve, since the data can now be shared, and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, use of a DBMS should allow users who don't know programming to interact with the data more easily, unlike a file processing system, where the programmer may need to write new programs to meet every new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints. In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an ad hoc, temporary manner. Often, different systems of an organization access different components of the operational data; in such an environment, enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8. Organizational requirements can be identified - All organizations have sections and departments, and each of these units often considers the work of its unit as the most important, and therefore its needs as the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. It may thus become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for a similar service using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages
developed alongside DBMSs than using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand, and the overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes for backup and
recovery from failures, including disk crashes, power failures, and software errors, which may
help the database recover from an inconsistent state to the state that existed prior to the
failure, though the methods involved are complex.
QUE 2 - EITHER
(A) Explain the ER model with a suitable example.
Ans. It is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. Building it is an
iterative, team-oriented process in which all business managers (or their designates) should be
involved, and the result should be validated with a "bottom-up" approach. The model has three
primary components: entity, relationship, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships, and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or
a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there
are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A student (entity) has attributes such as student ID,
student name, address, etc.
Attributes are of various types:
Simple/single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a
computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One to many: 1 ------ M
Many to one: M ------ 1
Many to many: M ------ M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example:
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, and address (city, state, zip_code, street), where
street is itself composite (street_name, street_number, apartment_number).
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans. In the sequential file, index sequential file, and direct file organizations we have considered the
retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
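Secondary key retrieval can be sketched in a few lines of Python. The sample student records and the index structure below are invented for illustration; the point is that one secondary key value ("stud_name") maps to a set of matching records, not a single one:

```python
# Illustrative sketch: a secondary-key index maps a non-unique
# attribute such as stud_name to the positions of all matching
# records, so one lookup can return many records.
from collections import defaultdict

students = [
    {"roll_no": 1, "stud_name": "Asha", "city": "Nagpur"},
    {"roll_no": 2, "stud_name": "Ravi", "city": "Pune"},
    {"roll_no": 3, "stud_name": "Asha", "city": "Mumbai"},
]

# Build a secondary index on stud_name (the secondary key).
name_index = defaultdict(list)
for pos, rec in enumerate(students):
    name_index[rec["stud_name"]].append(pos)

# Secondary-key retrieval: one key value, possibly many records.
matches = [students[p] for p in name_index["Asha"]]
```

Contrast this with primary key retrieval on roll_no, where at most one record can match.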
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3 - EITHER
(A) Let R(A, B, C) be a schema and let r1 and r2 both be relations on schema R. Give an equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE 4 - EITHER
(A) What is join dependency? Discuss 5NF.
Ans. Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot
have a lossless decomposition into any number of smaller tables.
Another way of expressing this is that each join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one value, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item, and Vendor-Item.
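The decomposition can be checked mechanically. The Python sketch below (using the sample data above) projects Buying onto its three binary tables and re-joins them; getting back exactly the original rows is the join dependency on which 5NF rests:

```python
# The Buying(buyer, vendor, item) sample from the text, decomposed
# into its three binary projections; their natural join reconstructs
# the original relation (a lossless three-way decomposition).
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three projections (Buyer-Vendor, Buyer-Item, Vendor-Item).
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural join of all three projections.
rejoined = {
    (b, v, i)
    for b, v in buyer_vendor
    for b2, i in buyer_item if b2 == b
    for v2, i2 in vendor_item if v2 == v and i2 == i
}
```

With the three-table design, recording that Claiborne starts to sell jeans is a single new row in Vendor-Item rather than one row per buyer.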
(B) Explain the architecture of an IMS system.
Ans. Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
[Figure: IMS system architecture - application programs A and B, each written in a host language plus
DL/I, access the IMS control program through their program specification blocks (PSB-A, PSB-B); each
PSB contains PCBs, which map onto the DBDs describing the physical databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD), which also defines the mapping of the physical
database to storage. The set of all DBDs corresponds to the conceptual schema plus the associated
conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
DBD   NAME=EDUCPDBD
SEGM  NAME=COURSE,BYTES=256
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
FIELD NAME=DESCRIPN,BYTES=220,START=37
SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
FIELD NAME=LOCATION,BYTES=12,START=7
FIELD NAME=FORMAT,BYTES=2,START=19
SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example:
PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
SENSEG NAME=COURSE,PROCOPT=G
SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are
supported via user-written online application programs; IMS does not provide an integrated query
language.
OR
(C) Explain the following:
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of
another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate
key, and no attribute in the key can be deleted without destroying the property of unique
identification.
The main characteristics of functional dependencies used in normalization are that they have a
1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency,
they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is
important to find an approach that can reduce the set to a manageable size. We need to identify
a set of functional dependencies (X) for a relation that is smaller than the complete set of
functional dependencies (Y) for that relation and has the property that every functional
dependency in Y is implied by the functional dependencies in X.
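Whether an FD holds in a given relation instance can be checked directly from the definition: X -> Y fails exactly when two rows agree on X but differ on Y. A small sketch (the employee rows are invented for illustration):

```python
# Check a functional dependency X -> Y against a relation instance:
# it fails iff two rows share the same X-values but differ on Y.
def fd_holds(rows, lhs, rhs):
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False  # same determinant, different dependent value
        seen[key] = val
    return True

emp = [
    {"emp_id": 1, "dept": "Sales", "dept_city": "Nagpur"},
    {"emp_id": 2, "dept": "Sales", "dept_city": "Nagpur"},
    {"emp_id": 3, "dept": "HR",    "dept_city": "Pune"},
]
```

Here emp_id -> dept and dept -> dept_city hold, while dept_city -> emp_id does not (two employees share the city "Nagpur"). Note that holding in one instance does not prove the FD holds "for all time"; that is a design decision about the enterprise.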
(D) Explain 4NF with examples.
Ans. Normalization: The process of decomposing unsatisfactory, "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties. In industry, particular attention is paid to normalization up to 3NF, BCNF, or
4NF; here we will pay particular attention to forms up to 3NF. Database designers need not
normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the
functional dependencies between its attributes. It is often executed as a series of steps, where each
step corresponds to a specific normal form with known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all of its multivalued dependencies are functional dependencies. 4NF
removes the unwanted data structures caused by multivalued dependencies.
For a relation to be in fourth normal form, either of these conditions must hold:
There is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers
multivalued dependencies.
Q5
EITHER
(A) What are object-oriented database systems? What are their features?
Ans. Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases also offer some kind of query language, allowing objects to be found through a
more declarative programming approach. It is in the area of object query languages, and the integration
of the query and navigational interfaces, that the biggest differences between products are found. An
attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as they are in a tabular implementation
of a relational database). This is because an object can be retrieved directly, without a search, by
following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer
following.)
Another area of variation between products is in the way that the schema of a database is defined. A
general characteristic, however, is that the programming language and the database schema use the same
type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the
set of all its versions, and object versions can be treated as objects in their own right. Some object
databases also provide systematic support for triggers and constraints, which are the basis of active
databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could get a user's account information and
efficiently provide extensive information such as transactions and account entries.
(C) How is database recovery done? Discuss its different types.
Ans. SQL Server database recovery models give you backup-and-restore flexibility. The model used will
determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to
explore the options available in order to prepare for the worst.
SQL Server database recovery is more easily achieved if you are running at least SQL Server 2000,
which has a built-in feature known as the database recovery model that controls the following:
Both the speed and size of your transaction log backups.
The degree to which you might be at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery models available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log, and when data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified point in time can be achieved after media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
The Log Marks feature allows you to place reference points in the transaction log so that you can
recover to a log mark.
This model also logs CREATE INDEX operations; recovery from a transaction log backup that includes index
creations is faster because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the
least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this
model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans.
Two-Phase Locking (2PL)
This locking protocol divides the execution of a transaction into three parts. In the first part, when
the transaction starts executing, it seeks permission for the locks it requires. The second part is where
the transaction acquires all the locks. As soon as the transaction releases its first lock, the third phase
starts; in this phase the transaction cannot demand any new locks, it only releases the acquired locks.
Two-phase locking thus has two phases: one is growing, where all the locks are being acquired by
the transaction, and the second phase is shrinking, where the locks held by the transaction are
being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then
upgrade it to an exclusive lock.
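The growing/shrinking rule can be captured in a few lines. The class below is a minimal invented sketch (it tracks only one transaction's lock set, not conflicts between transactions): once the first release happens, any further acquisition is a protocol violation.

```python
# Minimal sketch of the 2PL rule: locks may only be acquired during
# the growing phase; the first release starts the shrinking phase,
# after which any new lock request violates the protocol.
class TwoPhaseTransaction:
    def __init__(self):
        self.locks = set()
        self.shrinking = False

    def acquire(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after first release")
        self.locks.add(item)

    def release(self, item):
        self.shrinking = True  # first release ends the growing phase
        self.locks.discard(item)

t = TwoPhaseTransaction()
t.acquire("A")
t.acquire("B")
t.release("A")  # shrinking phase starts here; acquire() now raises
```

Strict 2PL, described next, is the special case where release() is deferred for every lock until commit.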
Strict Two-Phase Locking
The first phase of Strict-2PL is the same as in 2PL: after acquiring all the locks in the first phase, the
transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a
lock immediately after using it; it holds all the locks until the commit point and releases them all at
one time.
Strict-2PL does not suffer from cascading aborts as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either the system time or a logical counter as a timestamp.
Lock-based protocols manage the order between conflicting pairs of operations among transactions at
the time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 0002 would be older than all other
transactions that come after it; for example, a transaction y entering the system at 0004 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read-timestamp and write-timestamp. This lets the
system know when the last read and write operations were performed on the data item.
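The per-item read/write timestamps drive a simple accept/reject rule. The sketch below is a simplified illustration of basic timestamp ordering (not a full scheduler; in a real system a rejected operation rolls the transaction back and restarts it with a new timestamp):

```python
# Simplified timestamp-ordering checks: an operation is rejected when
# a younger transaction has already accessed the item in a
# conflicting way.
class Item:
    def __init__(self):
        self.read_ts = 0   # timestamp of the youngest reader so far
        self.write_ts = 0  # timestamp of the youngest writer so far

def read(item, ts):
    if ts < item.write_ts:          # a younger txn already wrote it
        return False                # reject: transaction must roll back
    item.read_ts = max(item.read_ts, ts)
    return True

def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:
        return False                # reject: would invalidate younger ops
    item.write_ts = ts
    return True
```

For example, after a transaction with timestamp 2 writes an item, a read by an older transaction (timestamp 1) is rejected, while a read by a younger one (timestamp 3) succeeds.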
OR
(b) Explain database security mechanisms.
Database security covers and enforces security on all aspects and components of databases. This
includes:
Data stored in the database
The database server
The database management system (DBMS)
Other database workflow applications
Database security is generally planned, implemented, and maintained by a database administrator
and/or other information security professionals.
Some of the ways database security is analyzed and implemented include:
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls.
Load/stress testing and capacity testing of the database to ensure it does not crash under a
distributed denial of service (DDoS) attack or user overload.
Physical security of the database server and backup equipment against theft and natural
disasters.
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them.
(d) Explain knowledge-based database systems in detail.
Ans.
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database. At the time (the 1970s), virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database. At this point in the history of Information Technology, the distinction
between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
Flat data: Data was usually represented in a tabular format with strings or numbers in each
field.
Multiple users: A conventional database needed to support more than one user or system
logged into the same data at the same time.
Transactions: An essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-called ACID
properties: Atomicity, Consistency, Isolation, and Durability.
Large, long-lived data: A corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in the artificial
intelligence literature) with classes, subclasses, and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze, and reuse
knowledge. For example, see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge base compared to a conventional
database. The knowledge base needed to know facts about the world, for example to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store thousands of rows that represented information
about specific humans. Representing that all humans are mortal, and being able to reason about
any given human that they are mortal, is the work of a knowledge base. Representing that
George, Mary, Sam, Jenna, Mike, and hundreds of thousands of other customers are all humans
with specific ages, sex, address, etc. is the work of a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments,
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple, distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and object-oriented
communities, object-oriented databases such as Versant emerged: systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements, such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution of the term knowledge-base was driven by the Internet. With the rise of the Internet,
documents, hypertext, and multimedia support became critical for any corporate database. It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents; this created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes. Knowledge management actually predated the
Internet, but with the Internet there was great synergy between the two areas. Knowledge
management products adopted the term knowledge-base to describe their repositories, but the
meaning had a subtle difference. In the case of previous knowledge-based systems, the
knowledge was primarily for the use of an automated system, to reason about and draw
conclusions about the world. With knowledge management products, the knowledge was
primarily meant for humans, for example to serve as a repository of manuals, procedures,
policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering amp Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 ndash 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1 -
(A) Explain the following in detail:
(i) Concurrency control
Ans. Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data, as there is no way they can interfere with one another. However,
any practical database has a mix of READ and WRITE operations, and hence concurrency is
a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user
system. It helps you make sure that database transactions are performed concurrently
without violating the data integrity of the respective databases.
Concurrency control is therefore a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data are
executed simultaneously.
(ii) Atomicity property
In database systems, atomicity (from the Greek átomos, "undividable") is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of the two operations fails.
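The all-or-nothing behaviour of the bank-transfer example can be sketched as follows. This is an illustrative toy (a copy-and-swap in memory, not how a real DBMS implements atomicity via logging): both updates are made on a tentative copy, and the real state is replaced only if every step succeeds.

```python
# Toy sketch of atomicity: apply both operations to a tentative copy;
# commit (replace the real state) only if every step succeeds, so a
# failed transfer leaves the accounts exactly as they were.
accounts = {"A": 100, "B": 50}

def transfer(state, src, dst, amount):
    working = dict(state)            # tentative copy of the database
    working[src] -= amount
    if working[src] < 0:
        raise ValueError("insufficient funds")  # abort: nothing applied
    working[dst] += amount
    state.update(working)            # commit: both updates appear at once

transfer(accounts, "A", "B", 30)     # succeeds: both updates applied
try:
    transfer(accounts, "A", "B", 500)
except ValueError:
    pass                             # aborted: accounts left unchanged
```

In both outcomes the total money in the system is preserved, which is the consistency guarantee the text describes.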
(B) Give the three-level architecture proposal for DBMS.
Ans. The objectives of the three-level architecture proposal for DBMS are:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to the physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one that is closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database; this view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part that
is relevant to the user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for a DBMS are explained
above.
(C) Describe the structure of DBMS
Ans: A DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update and
retrieval) on the database. The components of the DBMS perform these requested operations on the
database and provide the necessary data to the users.
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - The Data Description Language compiler processes the schema definitions specified
in the DDL. It includes metadata information such as the names of the files, the data items, the storage
details of each file, mapping information, constraints, etc.
2 DML Compiler and Query Optimizer - DML commands such as insert, update, delete and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized into the best way to execute the query by
the query optimizer, and then sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
• converting operations in users' queries, coming from the application programs or from the
combination of DML compiler and query optimizer (known as the Query Processor), from the user's
logical view to the physical file system;
• controlling access to the DBMS information that is stored on disk;
• handling buffers in main memory;
• enforcing constraints to maintain the consistency and integrity of the data;
• synchronizing the simultaneous operations performed by concurrent users;
• controlling the backup and recovery operations.
4 Data Dictionary - The data dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data - the names of the tables, the names of the attributes of each table, the lengths of attributes, and the number of rows in each table.
2. Relationships between database transactions and the data items they reference,
which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
5. Access authorization - a description of database users, their responsibilities,
and their access rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to control data integrity, database operation and accuracy, and
may be regarded as an important part of the DBMS.
Importance of Data Dictionary -
A data dictionary is necessary in databases for the following reasons:
• It improves the control of the DBA over the information system and the users'
understanding of the use of the system.
• It helps in documenting the database design process by storing documentation of the results of every design phase and the design decisions made.
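As a small analogue of a data dictionary, SQLite records the name, type and defining DDL of every database object in its built-in catalogue table sqlite_master. A sketch (the table is illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT)")

# sqlite_master plays the role of a (very small) data dictionary:
# one row per object, describing what it is and how it was defined.
entry = con.execute(
    "SELECT type, name, sql FROM sqlite_master WHERE name = 'course'"
).fetchone()
print(entry[0], entry[1])  # table course
con.close()
```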
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts high-level queries into low-level file access
commands, known as compiled DML.
7 End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve Users: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls into this category. The user is instructed through each step of a transaction, and he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect only a precise portion of the database; in the case of an ATM user, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of responses is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online Users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. They are aware of the presence of the database system and may have acquired a certain amount of expertise within the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application Programmers: Professional programmers who are responsible for developing the application programs or user interfaces used by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language, such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exercised by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of the physical storage from the use of the
data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored, since such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one
device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In a conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• duplication of the same data in different files;
• wastage of storage space, since duplicated data is stored;
• errors generated due to updating of the same data in different files;
• time wasted in entering the same data again and again;
• computer resources being needlessly used;
• difficulty in combining information.
2 Elimination of Inconsistency - In a file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This can lead to inconsistent data, so we need to remove this duplication of
data across multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In a
conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
currency is likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in a database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, the use of a DBMS
should allow users who do not know programming to interact with the data more easily, unlike a
file processing system, where the programmer may need to write new programs to meet every
new demand.
4 Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not change when the data in the
database changes.
5 Integrity can be improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
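One concrete way a DBMS enforces such integrity rules centrally is a declarative CHECK constraint. A sketch using Python's sqlite3 (the table and the rule are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# The DBMS, not the application, enforces the rule "balance must be non-negative".
con.execute(
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY,"
    " balance REAL CHECK (balance >= 0))"
)
con.execute("INSERT INTO account VALUES (1, 100.0)")

try:
    con.execute("INSERT INTO account VALUES (2, -50.0)")  # violates the rule
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)  # True: the DBMS refused the inconsistent row
con.close()
```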
6 Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the purposes
of data interchange or migration between systems.
7 Security can be improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization access different
components of the operational data, and in such an environment enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database, and different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8 Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work the most
important, and therefore its own needs the most important. Once a database has been
set up with centralized control, it becomes necessary to identify the organization's requirements and
to balance the needs of the competing units. It may become necessary to ignore some
requests for information if they conflict with higher-priority needs of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher when using the non-procedural languages that
have been developed for DBMSs than when using procedural languages.
10 Data Model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand. The overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes for backup and
recovery from failures, including disk crashes, power failures and software errors,
which help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods involved can be very complex.
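As a miniature illustration of whole-database backup under centralized control, Python's sqlite3 exposes a backup API that copies one database into another as a unit (the data is made up):

```python
import sqlite3

# "Production" database with one row of data.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE t (x INTEGER)")
src.execute("INSERT INTO t VALUES (42)")

# Because the data is centralized, the whole database can be copied in one
# operation; Connection.backup performs a consistent page-level copy.
dst = sqlite3.connect(":memory:")
src.backup(dst)

restored = dst.execute("SELECT x FROM t").fetchone()[0]
print(restored)  # 42: the backup contains the committed data
```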
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The E-R model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. Building the
model is an iterative, team-oriented process in which all business managers (or their designates)
should be involved, and the result should be validated with a "bottom-up" approach. It has three primary components: entities,
relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer; a supervises relationship between an employee and a department; a performs relationship
between an artist and a song; a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One-to-many (1 : M), Many-to-one (M : 1),
Many-to-many (M : N)
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: entity Customer, with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), and
street itself composite (street_name, street_number, apartment_number).
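One way the Customer entity above could be mapped to a relation is by flattening the composite attributes (name, address, street) into simple columns. A sketch in Python's sqlite3; the column types are assumptions, not part of the original example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Each component of a composite attribute becomes its own simple column.
con.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,  -- primary key attribute
        first_name       TEXT,
        last_name        TEXT,
        middle_name      TEXT,                 -- components of 'name'
        phone_number     TEXT,
        date_of_birth    TEXT,
        city             TEXT,
        state            TEXT,
        zip_code         TEXT,                 -- components of 'address'
        street_name      TEXT,
        street_number    TEXT,
        apartment_number TEXT                  -- components of 'street'
    )
""")
cols = [row[1] for row in con.execute("PRAGMA table_info(customer)")]
print(len(cols))  # 12 simple columns
con.close()
```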
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file and direct file organizations, we have considered the retrieval and
update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
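The idea can be sketched with an in-memory secondary index that maps each stud_name value to all matching primary keys; the records are made up for illustration:

```python
from collections import defaultdict

# Student records keyed by the primary key (roll number); data is invented.
students = {
    1: {"stud_name": "Ravi", "branch": "MCA"},
    2: {"stud_name": "Asha", "branch": "MCA"},
    3: {"stud_name": "Ravi", "branch": "MBA"},
}

# A secondary index on the non-unique attribute stud_name maps each value
# to the list of matching primary keys, since several records may qualify.
by_name = defaultdict(list)
for key, rec in students.items():
    by_name[rec["stud_name"]].append(key)

matches = by_name["Ravi"]
print(matches)  # [1, 3] -- more than one record satisfies the key value
```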
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3 - EITHER
(A) Let R(A,B,C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot
be further non-loss decomposed into any number of smaller tables.
Another way of expressing this is: every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependency: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
• you always need to know two values (pairwise);
• for any one value, you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, to determine
the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
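The decomposition into the three pairwise tables, and the fact that joining them back reconstructs the original Buying table losslessly, can be checked directly on the sample data above. A sketch in Python:

```python
# The sample Buying data from the text, as (buyer, vendor, item) tuples.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The 5NF decomposition: project onto the three pairs.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural-join all three projections back together.
rejoined = {
    (b, v, i)
    for (b, v) in buyer_vendor
    for (b2, i) in buyer_item if b2 == b
    for (v2, i2) in vendor_item if v2 == v and i2 == i
}
print(rejoined == buying)  # True: the join dependency holds for this data
```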
(B) Explain the architecture of an IMS System
Ans Information Management system (IMS) is an IBM program product that is designed to support
both batch and online application programs
Fig: IMS system architecture - application programs A and B (host language + DL/I) each have a
PSB containing PCBs; the IMS control program maps these onto the DBDs that define the physical
databases.
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also given in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated
conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE#,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called the program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: the value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: a possible key. Each non-key field is functionally dependent on every candidate key,
and no attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of the functional dependencies used in
normalization:
• they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of
the dependency;
• they hold for all time;
• they are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation,
and that has the property that every functional dependency in Y is implied by the
functional dependencies in X.
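A functional dependency X → Y can be checked mechanically against a relation instance: equal X-values must never map to different Y-values. A sketch in Python; the rows and attribute names are illustrative:

```python
def holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in rows:
    equal determinant values must always give equal dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False  # same determinant, different dependent value
        seen[key] = val
    return True

# Illustrative student rows (invented data).
rows = [
    {"roll": 1, "name": "Ravi", "dept": "MCA"},
    {"roll": 2, "name": "Asha", "dept": "MCA"},
    {"roll": 1, "name": "Ravi", "dept": "MCA"},
]
print(holds(rows, ["roll"], ["name"]))  # True: roll determines name
print(holds(rows, ["dept"], ["name"]))  # False: same dept, different names
```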
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF or 4NF.
We will pay particular attention up to 3NF.
The database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies
between its attributes.
It is often executed as a series of steps, where each step corresponds to a specific normal form with
known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format, and
also less vulnerable to update anomalies.
• NF2: non-first normal form
• 1NF: R is in 1NF iff all domain values are atomic
• 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
• 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
• BCNF: R is in BCNF iff every determinant is a candidate key
• Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and
only if it is in BCNF and every non-trivial multivalued dependency is in fact a functional dependency. 4NF
removes an unwanted data structure: multivalued dependencies.
One of the following conditions must hold for a relation to be in fourth normal form:
• there is no multivalued dependency in the relation; or
• there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
considers multivalued dependencies.
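A 4NF-style decomposition of a relation with a multivalued dependency can be sketched in Python. The course/teacher/book data is hypothetical: for each course, the set of teachers is independent of the set of books (course ↠ teacher and course ↠ book):

```python
# Hypothetical relation with a multivalued dependency.
ctb = {
    ("DBMS", "Rao", "Date"),
    ("DBMS", "Rao", "Navathe"),
    ("DBMS", "Sen", "Date"),
    ("DBMS", "Sen", "Navathe"),
}

# 4NF decomposition: split the two independent facts into separate tables.
course_teacher = {(c, t) for c, t, b in ctb}
course_book    = {(c, b) for c, t, b in ctb}

# The natural join of the two projections recreates the original rows exactly,
# so the decomposition is lossless.
rejoined = {(c, t, b)
            for (c, t) in course_teacher
            for (c2, b) in course_book if c2 == c}
print(rejoined == ctb)  # True
```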
Q5
Either
(A) What are object oriented database systems What are its features
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found by a more declarative programming approach. It is in the area of object query languages, and the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
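The "retrieval by pointer following" point above can be sketched in Python: the related object is reached through a direct reference held in the object itself, with no join or key lookup. The classes and fields are illustrative:

```python
# Object navigation: an account holds a direct reference to its owner.
class Customer:
    def __init__(self, name):
        self.name = name

class Account:
    def __init__(self, number, owner):
        self.number = number
        self.owner = owner  # direct pointer to a Customer object

alice = Customer("Alice")
acct = Account("ACC-1", alice)

# No join and no key lookup: just follow the reference, as an object
# database does when traversing inter-object pointers.
print(acct.owner.name)  # Alice
```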
C) How database recovery it done Discuss its different types
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to
explore the options available in order to prepare for the worst.
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000.
It has a built-in feature, known as the database recovery model, that controls:
• both the speed and size of your transaction log backups;
• the degree to which you might be at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery model available:
• Full Recovery
• Bulk-Logged Recovery
• Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
• Database restoration up to any specified point in time can be achieved after a media failure for a
database file. If your log file is available after the failure, you can restore up to the last
committed transaction.
• The Log Marks feature allows you to place reference points in the transaction log and
recover to a log mark.
• CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index
creations is faster because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans
Strict 2PL does not have cascading aborts, as 2PL does.
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp-based protocol. This protocol
uses either the system time or a logical counter as a timestamp.
Lock-based protocols manage the order between conflicting pairs among transactions at the
time of execution, whereas timestamp-based protocols start working as soon as a transaction is
created.
Every transaction has a timestamp associated with it, and the ordering is determined by the age
of the transaction. A transaction created at clock time 0002 would be older than all other
transactions that come after it. For example, any transaction y entering the system at 0004 is
two seconds younger, and priority would be given to the older one.
In addition, every data item is given the latest read- and write-timestamp. This lets the system
know when the last 'read' and 'write' operations were performed on the data item.
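The read/write timestamp rules described above can be sketched as basic timestamp ordering for a single data item; the class, function and field names below are my own, not from the text:

```python
class Item:
    """A data item carrying the timestamps of its youngest reader/writer."""
    def __init__(self):
        self.read_ts = 0
        self.write_ts = 0

def read(item, ts):
    # A transaction may not read a value written by a younger transaction.
    if ts < item.write_ts:
        return "abort"
    item.read_ts = max(item.read_ts, ts)
    return "ok"

def write(item, ts):
    # A write is rejected if a younger transaction already read or wrote the item.
    if ts < item.read_ts or ts < item.write_ts:
        return "abort"
    item.write_ts = ts
    return "ok"

x = Item()
print(write(x, 5))  # ok: first write, write_ts becomes 5
print(read(x, 3))   # abort: transaction 3 is older than the writer (ts 5)
print(read(x, 8))   # ok: read_ts becomes 8
print(write(x, 6))  # abort: a younger transaction (ts 8) already read x
```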
OR
(b) Explain database security mechanisms. (8)
Ans: Database security covers and enforces security on all aspects and components of databases. This
includes:
• data stored in the database;
• the database server;
• the database management system (DBMS);
• other database workflow applications.
Database security is generally planned implemented and maintained by a database administrator
and or other information security professional
Some of the ways database security is analyzed and implemented include:
• restricting unauthorized access and use by implementing strong and multifactor access
and data management controls;
• load/stress testing and capacity testing of a database to ensure it does not crash under a
distributed denial of service (DDoS) attack or user overload;
• physical security of the database server and backup equipment against theft and natural
disasters;
• reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them.
(d) Explain knowledge-based database systems in detail.
Ans
The term knowledge base was coined to distinguish this form of knowledge store from the
more common and widely used term database. At the time (the 1970s), virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database. At this point in the history of information technology, the distinction
between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
Flat data: data was usually represented in a tabular format, with strings or numbers in each
field.
Multiple users: a conventional database needed to support more than one user or system
logged into the same data at the same time.
Transactions: an essential requirement for a database was to maintain integrity and
consistency among data accessed by concurrent users. These are the so-called ACID
properties: Atomicity, Consistency, Isolation, and Durability.
Large, long-lived data: a corporate database needed to support not just thousands but
hundreds of thousands or more rows of data. Such a database usually needed to persist past
the specific uses of any individual program; it needed to store data for years and decades
rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in the artificial
intelligence literature) with classes, subclasses, and instances.
Early expert systems also had little need for multiple users or the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze, and reuse
knowledge. For example, see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge-base compared to a conventional
database. The knowledge-base needed to know facts about the world, for example to represent
the statement that "All humans are mortal". A database typically could not represent this general
knowledge, but would instead need to store thousands of rows of information about specific
humans. Representing that all humans are mortal, and being able to reason that any given human
is mortal, is the work of a knowledge-base. Representing that George, Mary, Sam, Jenna, Mike,
and hundreds of thousands of other customers are all humans with specific ages, sex, address,
etc. is the work of a database.[3][4]
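The "all humans are mortal" contrast can be made concrete with a toy rule engine (purely illustrative; the fact and rule representation here is invented): the general rule is stored once and mortality of any individual is derived, rather than stored row by row as a database would.

```python
# Toy knowledge-base: a set of ground facts plus one general rule.
# A database would instead store a "mortal" value for every individual row.

facts = {("human", "George"), ("human", "Mary"), ("human", "Sam")}
rules = [("human", "mortal")]   # rule: if X is human then X is mortal

def is_a(kind, name):
    """Derive class membership from stored facts plus general rules."""
    if (kind, name) in facts:
        return True
    # apply each rule whose conclusion matches the kind being asked about
    return any(conclusion == kind and is_a(premise, name)
               for premise, conclusion in rules)
```

Asking `is_a("mortal", "George")` succeeds even though no "mortal" fact was ever stored; that derivation step is what separates a knowledge-base from a plain table of rows.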
As expert systems moved from being prototypes to systems deployed in corporate environments,
their data storage requirements rapidly started to overlap with the standard database
requirements for multiple, distributed users with support for transactions. Initially, the demand
could be seen in two different but competitive markets. From the AI and object-oriented
communities, object-oriented databases such as Versant emerged: systems designed
from the ground up to support object-oriented capabilities, but also to support standard
database services. On the other hand, the large database vendors such as Oracle added
capabilities to their products that supported knowledge-base requirements, such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet. With the rise of the Internet,
documents, hypertext, and multimedia support became critical for any corporate database. It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory. Support for corporate web sites required persistence and
transactions for documents. This created a whole new discipline known as Web Content
Management. The other driver for document support was the rise of knowledge management
vendors such as Lotus Notes. Knowledge management actually predated the Internet, but with
the Internet there was great synergy between the two areas. Knowledge management products
adopted the term knowledge-base to describe their repositories, but the meaning had a subtle
difference. In the case of previous knowledge-based systems, the knowledge was primarily for
the use of an automated system, to reason about and draw conclusions about the world. With
knowledge management products, the knowledge was primarily meant for humans, for example
to serve as a repository of manuals, procedures, policies, best practices, reusable designs and
code, etc. In both cases, the distinctions between the uses and kinds of systems were ill-defined.
As the technology scaled up, it was rare to find a system that could really be cleanly classified as
knowledge-based in the sense of an expert system that performed automated reasoning, or
knowledge-based in the sense of knowledge management that provided knowledge in the form
of documents and media that could be leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018 – 2019
Subject: DBMS
MCA 1st year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data, since there is no way they can interfere with one another.
However, any practical database has a mix of READ and WRITE operations, and hence
concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user
system. It helps you to make sure that database transactions are performed concurrently
without violating the data integrity of the respective databases.
Therefore, concurrency control is an essential element for the proper functioning of a system
where two or more database transactions that require access to the same data are executed
simultaneously.
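The conflict that concurrency control prevents can be simulated deterministically. The classic lost-update interleaving (a toy sketch with invented names, not threads) has two transactions read the same balance before either writes, so the second write clobbers the first:

```python
# Deterministic simulation of the lost-update problem: T1 and T2 both
# deposit into the same account, but both read before either writes.

def interleaved_deposits(balance, deposit_a, deposit_b):
    """Interleaved schedule: both reads happen before either write."""
    read_a = balance                  # T1 reads the balance
    read_b = balance                  # T2 reads before T1 has written
    balance = read_a + deposit_a      # T1 writes its result
    balance = read_b + deposit_b      # T2 writes, overwriting T1's update
    return balance

def serial_deposits(balance, deposit_a, deposit_b):
    """Correct result when the two transactions run one after the other."""
    return balance + deposit_a + deposit_b
```

Starting from 100 with deposits of 50 and 30, the interleaved schedule yields 130 (T1's deposit is lost) while the serial schedule yields 180; a concurrency control protocol forces every allowed schedule to be equivalent to some serial one.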
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, 'undividable') is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all
occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B.
It consists of two operations: withdrawing the money from account A and depositing it in
account B. Performing these operations in an atomic transaction ensures that the database
remains in a consistent state, that is, money is neither lost nor created if either of the two
operations fails.
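The bank-transfer example can be sketched with Python's sqlite3 module, whose connection context manager commits the two updates together or rolls both back on an exception (a simplified illustration; the table and account names are invented):

```python
# Atomic transfer: both UPDATEs commit together, or neither does.
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between accounts atomically; roll back on failure."""
    try:
        with conn:  # commits on normal exit, rolls back on exception
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            cur = conn.execute("SELECT balance FROM accounts WHERE name = ?", (src,))
            if cur.fetchone()[0] < 0:
                raise ValueError("insufficient funds")  # triggers rollback
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 100), ('B', 0)")
```

A transfer that would overdraw account A raises inside the `with` block, so the withdrawal that already ran is rolled back and no money is lost or created.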
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for a DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know the physical database storage details.
The DBA should be able to change the database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to the physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database; this view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part
that is relevant to the user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used
to control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for a DBMS are explained above.
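One way to make the external/conceptual distinction concrete is a SQL view: the base table plays the conceptual level, a view is one user's external level, and the file format underneath is the internal level the user never sees (an illustrative sqlite3 sketch; the table and column names are invented):

```python
# Conceptual level: the full employee table.
# External level: a view for users who must not see salaries.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Asha", 50000), (2, "Ravi", 65000)])
# External view: salaries are simply absent from this user's world.
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")
```

The DBA can later change the base table's storage, or even add columns, without the view's users noticing, which is exactly the data independence the three-level objectives call for.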
(C) Describe the structure of DBMS.
Ans: A DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update, and
retrieval) on the database. The components of the DBMS perform these requested operations on
the database and provide the necessary data to the users.
Fig: Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions
specified in the DDL. It includes metadata information such as the names of the files and data
items, storage details of each file, mapping information, and constraints.
2. DML Compiler and Query Optimizer - The DML commands (insert, update, delete,
retrieve) from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized for the best way to execute the
query by the query optimizer and then sent to the data manager.
3. Data Manager - The data manager is the central software component of the DBMS, also
known as the Database Control System.
The main functions of the data manager are:
It converts operations in users' queries, coming from the application programs or from the
combination of DML compiler and query optimizer (known as the query processor), from the
user's logical view to the physical file system.
It controls access to the DBMS information that is stored on disk.
It handles buffers in main memory.
It enforces constraints to maintain the consistency and integrity of the data.
It synchronizes the simultaneous operations performed by concurrent users.
It controls backup and recovery operations.
4. Data Dictionary - The data dictionary is a repository of descriptions of the data in the
database. It contains information about:
1. Data - the names of the tables, the names of the attributes of each table, the lengths of
attributes, and the number of rows in each table.
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data
definitions are changed.
3. Constraints on data, i.e., the ranges of values permitted.
4. Detailed information on the physical database design, such as storage structures,
access paths, and file and record sizes.
5. Access authorization - descriptions of database users, their responsibilities, and their
access rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to control data integrity, database operation, and accuracy, and may
be used as an important part of the DBMS.
Importance of the Data Dictionary - A data dictionary is necessary in databases for the
following reasons:
• It improves the DBA's control over the information system and the users' understanding
of the use of the system.
• It helps in documenting the database design process by storing documentation of the
results of every design phase and of the design decisions.
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve users: Naïve users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls into this category: the user is instructed through each step of a transaction, and he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect only a precise portion of the database - in the case of an ATM user, only one or more of his or her own accounts. Other naïve users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naïve users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal, or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying, and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and the access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different
application systems. This stresses the importance of multiple applications sharing data: the
database becomes a common resource for an agency. It also implies separation of the physical
storage from the use of the data by an application program, i.e., program/data independence.
The user, programmer, or application specialist need not know the details of how the data are
stored; such details are transparent to the user. Changes can be made to the data without
affecting other components of the system, e.g., changing the format of data items (real to
integer arithmetic), changing the file structure (reorganizing data internally or changing the
mode of access), or relocating data from one device to another (e.g., from optical to magnetic
storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system, every user
group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files.
• Wastage of storage space, since the duplicated data is stored.
• Errors generated due to duplication of the same data in different files.
• Time wasted in entering the same data again and again.
• Computer resources being needlessly used.
• Difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to another
file. This may lead to inconsistent data, so we need to remove this duplication of data in
multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In
conventional systems, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness is likely to improve, since the data can now be shared and the DBMS makes it
easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, the use of a DBMS
should allow users that don't know programming to interact with the data more easily, unlike a
file processing system, where the programmer may need to write new programs to meet every
new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the data in
the database changes.
5. Integrity can be improved - Since the data of the organization using the database approach
is centralized and is used by a number of users at a time, it is essential to enforce integrity
constraints.
In conventional systems, because the data is duplicated in multiple files, updates or changes
may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the
purpose of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an
ad-hoc/temporary manner. Often different systems of an organization access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to which parts of the database, and
different checks can be established for each type of access (retrieve, modify, delete, etc.) to
each piece of information in the database.
8. Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work, and therefore its own
needs, as the most important. Once a database has been set up with centralized control, it will
be necessary to identify the organization's requirements and to balance the needs of the
competing units. It may become necessary to ignore some requests for information if they
conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond
to unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large, one
normally expects the overall cost of setting up the database and developing and maintaining
the application programs to be far lower than for a similar service using conventional systems,
since the productivity of programmers can be higher when using the non-procedural languages
that have been developed with DBMSs than when using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand; the overall view is often not considered. Building an overall view of the
organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes such as
recovery and backup from failures, including disk crashes, power failures, and software errors,
which may help the database recover from an inconsistent state to the state that existed prior
to the occurrence of the failure, though the methods are very complex.
QUE 2 - EITHER
(A) Explain the ER model with a suitable example.
Ans: The ER model is a "top-down" approach. This data model allows us to describe how data
is used in a real-world enterprise. Modeling is an iterative, team-oriented process in which all
business managers (or their designates) should be involved, and the result should be validated
with a "bottom-up" approach. The model has three primary components: entities, relationships,
and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships, and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or
a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type. There are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A student's (entity) attributes: student ID, student
name, address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships
can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company
and a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One to many: 1 : M
Many to one: M : 1
Many to many: M : M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example:
Given: entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), where street is itself
composite (street_name, street_number, apartment_number).
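As an illustrative follow-on (not part of the original answer), one conventional relational mapping of this Customer entity flattens the composite attributes into separate columns while customer_id stays the primary key:

```python
# Mapping the Customer entity to a table: composite attributes (name,
# address, street) become individual columns. Multivalued attributes,
# if any, would go in their own table instead.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,
        first_name    TEXT,
        middle_name   TEXT,
        last_name     TEXT,
        phone_number  TEXT,
        date_of_birth TEXT,
        city          TEXT,
        state         TEXT,
        zip_code      TEXT,
        street_name   TEXT,
        street_number TEXT,
        apartment_number TEXT
    )
""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name, city) "
             "VALUES (1, 'Sally', 'Rao', 'Nagpur')")
```

This is only one possible mapping (the sample row is invented); a design that treated phone_number as multivalued would split it into a separate customer_phone table keyed by customer_id.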
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In the sequential file, index sequential file, and direct file organizations, we have considered
the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the
set of records which satisfy the given value.
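A secondary index over stud_name can be sketched as a map from each name to the list of matching primary keys (a toy illustration; the records are invented). Unlike the primary key, one secondary-key value can select several records:

```python
# Primary index: unique student id -> record.
# Secondary index: stud_name -> list of ids (names are not unique).

records = {
    101: {"stud_name": "Asha", "dept": "MCA"},
    102: {"stud_name": "Ravi", "dept": "MCA"},
    103: {"stud_name": "Asha", "dept": "MBA"},
}

secondary = {}  # stud_name -> list of primary keys
for pk, rec in records.items():
    secondary.setdefault(rec["stud_name"], []).append(pk)

def find_by_name(name):
    """Secondary-key retrieval: may return several records for one value."""
    return [records[pk] for pk in secondary.get(name, [])]
```

Searching on "Asha" returns two records, which is exactly the point made in (ii) above.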
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3 - EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE 4 - EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependency (JD)
A join dependency can be described as follows:
If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF
and it cannot have a lossless decomposition into any number of smaller tables.
Another way of expressing this: every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further,
then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you
create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order
to determine the item you must know the buyer and vendor; to determine the vendor you must
know the buyer and the item; and finally, to know the buyer you must know the vendor and the
item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item, and
Vendor-Item.
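The three-way decomposition can be checked mechanically: projecting the sample table onto the three attribute pairs and natural-joining the projections reconstructs exactly the original rows, which is what the join dependency asserts (an illustrative sketch using Python sets):

```python
# Verify the lossless three-way decomposition of the Buying table.

buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three projections (the proposed 5NF tables).
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural join of the three projections on their shared columns.
rejoined = {(b, v, i)
            for b, v in buyer_vendor
            for b2, i in buyer_item if b2 == b
            for v2, i2 in vendor_item if v2 == v and i2 == i}
```

For this sample data `rejoined == buying`, so no information is lost; and when Claiborne starts selling jeans, only one row (Claiborne, Jeans) needs to be added to the Vendor-Item table instead of one row per buyer.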
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to
support both batch and online application programs.
[Figure: IMS system architecture - each application (A and B) is written in a host language plus
DL/I calls; each application has a PSB (PSB-A, PSB-B) consisting of PCBs, and the IMS control
program maps these PCBs onto the DBDs that define the physical databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is
somewhat misleading in this context, since the user does not see such a database exactly as it is
stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage
structure. Each physical database is defined by a database description (DBD). The mapping of
the physical database to storage is also defined in the DBD. The set of all DBDs corresponds to
the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and
the object form is stored in a system library from which it may be extracted when required by
the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
1 DBD NAME=EDUCPDBD
2 SEGM NAME=COURSE,BYTES=256
3 FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4 FIELD NAME=TITLE,BYTES=33,START=4
5 FIELD NAME=DESCRIPN,BYTES=220,START=37
6 SEGM NAME=PREREQ,PARENT=COURSE,BYTES=36
7 FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8 FIELD NAME=TITLE,BYTES=33,START=4
9 SEGM NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE#,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external
view" of the data. A particular user's external view consists of a collection of "logical
databases", where each logical database is a subset of the corresponding physical database. Each
logical database is defined by means of a program communication block (PCB). The set of all
PCBs for one user, corresponding to the external schema plus the associated mapping
definition, is called the program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the
LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example:
1 PCB TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted
to perform on this segment. In this example the entry is G ("get"), indicating retrieval only.
Other possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS
data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call.
End users are supported via user-written online application programs. IMS does not provide an
integrated query language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
Main characteristics of the functional dependencies used in normalization:
- They have a 1:1 relationship between the attribute(s) on the left-hand and right-hand sides of the dependency.
- They hold for all time.
- They are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that reduces the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
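The definition above can be checked mechanically: X → Y holds in a relation instance iff no two tuples agree on X but differ on Y. A minimal sketch in Python (the relation and attribute names are made up for illustration):

```python
def holds_fd(rows, lhs, rhs):
    """Return True iff the functional dependency lhs -> rhs holds in `rows`.

    rows: list of dicts, each dict one tuple of the relation
    lhs, rhs: tuples of attribute names
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False  # two tuples agree on lhs but differ on rhs
        seen[key] = val
    return True

# hypothetical enrolment relation
enrol = [
    {"sid": 1, "sname": "Ann", "course": "DBMS"},
    {"sid": 1, "sname": "Ann", "course": "OS"},
    {"sid": 2, "sname": "Raj", "course": "DBMS"},
]
print(holds_fd(enrol, ("sid",), ("sname",)))   # True: sid determines sname
print(holds_fd(enrol, ("sid",), ("course",)))  # False: sid does not determine course
```

Note this only verifies a dependency against one instance; a true FD is an assertion about all legal instances of the relation.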
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal-form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF, or 4NF;
we will pay particular attention up to 3NF.
Database designers need not normalize to the highest possible normal form.
Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes
Often executed as a series of steps Each step corresponds to a specific normal form which has
known properties
As normalization proceeds relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and
only if it is in BCNF and its only multi-valued dependencies are functional dependencies; 4NF
thus removes the unwanted structures caused by multi-valued dependencies.
For a relation to be in fourth normal form, it must be in BCNF and one of the following must hold:
- There is no multivalued dependency in the relation, or
- There are multivalued dependencies, but the attributes involved are dependent between themselves.
Fourth normal form differs from BCNF only in that it also considers multivalued dependencies.
Q5
Either
(A) What are object oriented database systems What are its features
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could get a user's account information and
efficiently provide extensive information such as transactions, account information entries, etc.
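The "pointer following instead of joins" point above can be sketched with plain objects; this is an illustrative toy, not any particular object database's API:

```python
class Account:
    def __init__(self, number, balance):
        self.number, self.balance = number, balance

class Customer:
    def __init__(self, name):
        self.name = name
        self.accounts = []  # direct references, the object-database analogue of stored pointers

alice = Customer("Alice")
alice.accounts.append(Account("A-1", 500))
alice.accounts.append(Account("A-2", 1200))

# navigational access: follow the references from the customer object;
# no join between a customer table and an account table is needed
total = sum(a.balance for a in alice.accounts)
print(total)  # 1700
```

In a relational schema the same query would join `customer` to `account` on a foreign key; here the relationship is materialized as an in-object reference.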
C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000,
which has a built-in feature known as the database recovery model. It controls the following:
- The speed and size of your transaction log backups
- The degree to which you are at risk of losing committed transactions in the event of media failure
Models
There are three types of database recovery models available:
- Full Recovery
- Bulk-Logged Recovery
- Simple Recovery
Full Recovery
This is your best guarantee for full data recovery The SQL Server fully logs all operations so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log When data files are lost because of media failure the transaction log can be backed up
- Database restoration up to any specified time can be achieved after a media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
- The Log Marks feature allows you to place reference points in the transaction log so that you can
recover to a log mark.
- CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index
creations is faster because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans
Restricting unauthorized access and use by implementing strong and multifactor access
and data management controls
Load/stress testing and capacity testing of a database to ensure it does not crash in a
distributed denial-of-service (DDoS) attack or under user overload
Physical security of the database server and backup equipment from theft and natural
disasters
Reviewing the existing system for any known or unknown vulnerabilities, and defining and
implementing a road map/plan to mitigate them
(d) Explain knowledge-based database systems in detail.
Ans
The term knowledge-base was coined to distinguish this form of knowledge store from the
more common and widely used term database At the time (the 1970s) virtually all
large Management Information Systems stored their data in some type of hierarchical or
relational database At this point in the history of Information Technology the distinction
between a database and a knowledge base was clear and unambiguous
A database had the following properties:
- Flat data: Data was usually represented in a tabular format with strings or numbers in each field.
- Multiple users: A conventional database needed to support more than one user or system logged into the same data at the same time.
- Transactions: An essential requirement for a database was to maintain integrity and consistency among data accessed by concurrent users. These are the so-called ACID properties: Atomicity, Consistency, Isolation, and Durability.
- Large, long-lived data: A corporate database needed to support not just thousands but hundreds of thousands or more rows of data. Such a database usually needed to persist past the specific uses of any individual program; it needed to store data for years and decades rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database
requirements. An expert system requires structured data: not just tables with numbers and
strings, but pointers to other objects that in turn have additional pointers. The ideal representation
for a knowledge base is an object model (often called an ontology in the artificial
intelligence literature) with classes, subclasses, and instances.
Early expert systems also had little need for multiple users, or for the complexity that comes with
requiring transactional properties on data. The data for the early expert systems was used to
arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response
to an emergency.[1] Once the solution to the problem was known, there was not a critical demand
to store large amounts of data back to a permanent memory store. A more precise statement
would be that, given the technologies available, researchers compromised and did without these
capabilities because they realized they were beyond what could be expected, and they could
develop useful solutions to non-trivial problems without them. Even from the beginning, the
more astute researchers realized the potential benefits of being able to store, analyze, and reuse
knowledge. For example, see the discussion of Corporate Memory in the earliest work of the
Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge base compared to a conventional
database. The knowledge base needed to know facts about the world: for example, to represent
the statement that "all humans are mortal". A database typically could not represent this general
knowledge, but instead would need to store thousands of rows that represented information
about specific humans. Representing that all humans are mortal, and being able to reason that
any given human is mortal, is the work of a knowledge base; representing that George, Mary,
Sam, Jenna, Mike, and hundreds of thousands of other customers are all humans with specific
ages, sex, address, etc. is the work of a database.[3][4]
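The distinction can be made concrete with a tiny object-model sketch: the general rule "all humans are mortal" is stored once as knowledge about a class, while the database-style facts are the individual instances. The class and attribute names here are invented for illustration:

```python
class Human:                     # a class in the ontology sense
    pass

class Customer(Human):           # a subclass: every customer is a human
    def __init__(self, name, age):
        self.name, self.age = name, age

def is_mortal(thing):
    # the knowledge base stores the rule once, against the Human class;
    # isinstance() walks the class/subclass hierarchy to apply it
    return isinstance(thing, Human)

# database-style facts: specific individuals with specific attributes
george = Customer("George", 42)
mary = Customer("Mary", 35)

print(is_mortal(george))  # True, derived from the rule, not stored per row
```

A relational database would instead need an explicit column or row per individual to record mortality; the inference here comes for free from the class hierarchy.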
As expert systems moved from being prototypes to systems deployed in corporate environments
the requirements for their data storage rapidly started to overlap with the standard database
requirements for multiple distributed users with support for transactions Initially the demand
could be seen in two different but competitive markets From the AI and Object-Oriented
communities object-oriented databases such as Versant emerged These were systems designed
from the ground up to have support for object-oriented capabilities but also to support standard
database services as well. On the other hand, the large database vendors such as Oracle added
capabilities to their products that provided support for knowledge-base requirements, such as
class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge-base was the Internet With the rise of the Internet
documents hypertext and multimedia support were now critical for any corporate database It
was no longer enough to support large tables of data or relatively small objects that lived
primarily in computer memory Support for corporate web sites required persistence and
transactions for documents This created a whole new discipline known as Web Content
Management The other driver for document support was the rise of knowledge
management vendors such as Lotus Notes Knowledge Management actually predated the
Internet but with the Internet there was great synergy between the two areas Knowledge
management products adopted the term knowledge-base to describe their repositories but the
meaning had a subtle difference In the case of previous knowledge-based systems the
knowledge was primarily for the use of an automated system to reason about and draw
conclusions about the world With knowledge management products the knowledge was
primarily meant for humans for example to serve as a repository of manuals procedures
policies, best practices, reusable designs and code, etc. In both cases the distinctions between the
uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a
system that could really be cleanly classified as knowledge-based in the sense of an expert
system that performed automated reasoning, or as knowledge-based in the sense of knowledge
management that provided knowledge in the form of documents and media that could be
leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018 - 2019
Subject: DBMS
MCA 1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data; there is no way they can interfere with one another. However, any
practical database will have a mix of READ and WRITE operations, and hence
concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-
user system. It helps you make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously.
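The lost-update problem that concurrency control prevents can be sketched with a lock serializing a read-modify-write, the same idea a DBMS applies with lock-based protocols (the counter stands in for a shared data item; the scenario is illustrative):

```python
import threading

counter = 0                # shared data item
lock = threading.Lock()    # concurrency control: serializes conflicting writes

def deposit(times):
    global counter
    for _ in range(times):
        with lock:         # without this lock, interleaved read-modify-writes lose updates
            counter += 1

threads = [threading.Thread(target=deposit, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000: every update survives; no lost updates
```

Dropping the `with lock:` line makes the final count nondeterministic, which is exactly the kind of integrity violation concurrency control exists to rule out.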
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, translit. átomos, lit. "undividable") is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all occur, or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails.
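The transfer example can be demonstrated with SQLite, whose connection object rolls back the whole transaction if anything inside it fails (the account names and amounts are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
con.commit()

try:
    with con:  # one atomic transaction: commits on success, rolls back on any exception
        con.execute("UPDATE account SET balance = balance - 70 WHERE name = 'A'")
        raise RuntimeError("simulated crash between the two operations")
        # the matching deposit to B is never reached
except RuntimeError:
    pass

balances = dict(con.execute("SELECT name, balance FROM account"))
print(balances)  # {'A': 100, 'B': 0}: the partial withdrawal was rolled back
```

Without the transaction, account A would be left at 30 with the 70 vanished; atomicity guarantees the all-or-nothing outcome.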
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for DBMS:
- All users should be able to access the same data.
- A user's view is immune to changes made in other views.
- Users should not need to know physical database storage details.
- The DBA should be able to change database storage structures without affecting the users' views.
- The internal structure of the database should be unaffected by changes to physical aspects of storage.
- The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
These three levels are explained in detail below.
External Level
This is the highest level one that is closest to the user It is also called the user view The user
view is different from the way data is stored in the database This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
- Data Definition Language (DDL)
- Data Manipulation Language (DML)
- Data Control Language (DCL)
The data definition language defines and declares database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
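The DDL/DML split can be seen directly with SQLite from Python: DDL defines the object, DML operates on it. (SQLite has no DCL such as GRANT/REVOKE, so that sublanguage is not shown; the table and names are illustrative.)

```python
import sqlite3

con = sqlite3.connect(":memory:")

# DDL: define and declare the database object
con.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on the object
con.execute("INSERT INTO student VALUES (1, 'Asha')")
rows = con.execute("SELECT name FROM student").fetchall()
print(rows)  # [('Asha',)]
```

In a full client-server DBMS the DCL layer (e.g. GRANT SELECT ON student TO some_user) would sit alongside these two.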
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
that they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus, the objectives of the three-level architecture proposal for DBMS are explained
above.
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2 DML Compiler and Query optimizer - The DML commands such as insert update delete
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access The object code is then optimized in the best way to execute a query by
the query optimizer and then send to the data manager
3 Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
- Convert operations in user queries, coming from the application programs or from the combination of the DML Compiler and Query Optimizer (known as the Query Processor), from the user's logical view to the physical file system.
- Control DBMS information access that is stored on disk.
- Handle buffers in main memory.
- Enforce constraints to maintain the consistency and integrity of the data.
- Synchronize the simultaneous operations performed by concurrent users.
- Control the backup and recovery operations.
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1 Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2 Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3 Constraints on data, i.e. the range of values permitted.
4 Detailed information on physical database design, such as storage structures, access paths, and file and record sizes.
5 Access authorization - a description of database users, their responsibilities, and their access rights.
6 Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control data integrity, database operation, and accuracy, and may be used as an important part of the DBMS.
Importance of the Data Dictionary - the data dictionary is necessary in databases for the following reasons:
- It improves the control of the DBA over the information system and the users' understanding of the use of the system.
- It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML complier converts the high level Queries into low level file access
commands known as compiled DML
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve Users: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls into this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other such naive users are those for whom the type and range of response is always indicated. Thus, a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users There are users who may communicate with the database directly via an online terminal or indirectly via a user interface and application program These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program The more sophisticated of these users may also use a data manipulation language to manipulate the database directly On-line users can also be naive users requiring help such as menus
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal, or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator Centralized control of the database is exerted by a person or group of persons under the supervision of a high level administrator This person or group is referred to as the database administrator (DBA) They are users who are the most familiar with the database and are responsible for creating modifying and maintaining its three levels
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of physical storage from the use of the
data by an application program, i.e. program/data independence: the user, programmer, or
application specialist need not know the details of how the data are stored, as such details are
transparent to the user. Changes can be made to data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one device to
another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to duplication of the same data in different files
• Time wasted in entering the same data again and again
• Computer resources being needlessly used
• Difficulty in combining information
2 Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data, so we need to remove this duplication of
data across multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In a
conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness is likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, use of a DBMS
should allow users that don't know programming to interact with the data more easily, unlike a
file processing system where the programmer may need to write new programs to meet every
new demand.
4 Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the
data in the database changes.
5 Integrity can be improved - Since data of the organization using the database approach is
centralized and would be used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to entry of incorrect data in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through DBMS so
standards are easier to enforce Standards may relate to the naming of data format of data
structure of the data etc Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems
7 Security can be improved - In conventional systems, applications are developed in an
ad hoc/temporary manner. Often different systems of an organization would access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since data is
now centralized. It is easier to control who has access to what parts of the database. Different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8 The organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers the work of its unit as the most
important, and therefore considers its needs as the most important. Once a database has been
set up with centralized control, it will be necessary to identify the organization's requirements and
to balance the needs of the competing units. It may become necessary to ignore some
requests for information if they conflict with higher-priority needs of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up a database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed with DBMSs than using procedural languages.
10 A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as per the needs of particular
applications, and the overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes such as
recovery and backup from failures, including disk crashes, power failures, and software errors,
which help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The ER model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Modelling is an iterative, team-oriented process in which all business managers (or their designates) should be involved, and the result should be validated with a "bottom-up" approach. The model has three primary components: entities, relationships, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships, and attributes.
The building blocks of E-R model are entities relationships and attributes
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attribute: A characteristic of an entity. A Student entity, for example, has attributes such as student ID, student name and address.
Attributes are of various types:
- Simple/single attributes
- Composite attributes
- Multivalued attributes
- Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
- One-to-many (1:M)
- Many-to-one (M:1)
- Many-to-many (M:N)
Symbols and their meanings:
- Rectangles represent entity sets.
- Diamonds represent relationship sets.
- Lines link attributes to entity sets and entity sets to relationship sets.
- Ellipses represent attributes.
- Double ellipses represent multivalued attributes.
- Dashed ellipses denote derived attributes.
- Underlining indicates primary key attributes.
Example:
Entity: Customer, with attributes customer_id (primary key), name (composite: first_name, last_name, middle_name), phone_number, date_of_birth, and address (composite: city, state, zip_code, street), where street is itself composite (street_name, street_number, apartment_number).
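One common way to realise such an E-R design is to map it onto relational tables: composite attributes are flattened into their component columns, and a multivalued attribute (phone_number here, if a customer may have several) becomes a separate table. A minimal sketch using Python's built-in sqlite3 (column choices and sample values are illustrative only):

```python
import sqlite3

# Flatten the composite attributes (name, address, street) into columns;
# put the multivalued attribute phone_number in its own table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    first_name    TEXT, middle_name TEXT, last_name TEXT,
    date_of_birth TEXT,
    city TEXT, state TEXT, zip_code TEXT,
    street_name TEXT, street_number TEXT, apartment_number TEXT
);
CREATE TABLE customer_phone (     -- one row per phone number
    customer_id  INTEGER REFERENCES customer(customer_id),
    phone_number TEXT,
    PRIMARY KEY (customer_id, phone_number)
);
""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name) "
             "VALUES (1, 'Asha', 'Rao')")
conn.execute("INSERT INTO customer_phone VALUES (1, '555-0101')")
conn.execute("INSERT INTO customer_phone VALUES (1, '555-0102')")

# One customer, many phone numbers: the multivalued attribute is preserved.
phones = [r[0] for r in conn.execute(
    "SELECT phone_number FROM customer_phone WHERE customer_id = 1 "
    "ORDER BY phone_number")]
print(phones)
```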
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index sequential files and direct files we have considered the retrieval and update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file on the attribute "stud_name", we can get the set of records which satisfy the given value.
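The points above can be sketched in a few lines: the file is organised by its primary key, and a secondary index on stud_name maps each name to the set of matching records. All record values here are made up for illustration.

```python
# Primary organisation: roll_no (primary key) -> record
students = {
    101: {"stud_name": "Amit", "branch": "MCA"},
    102: {"stud_name": "Neha", "branch": "MCA"},
    103: {"stud_name": "Amit", "branch": "MBA"},
}

# Build a secondary index on stud_name. Unlike the primary key,
# a secondary key value need not be unique.
name_index = {}
for roll_no, rec in students.items():
    name_index.setdefault(rec["stud_name"], []).append(roll_no)

# Secondary key retrieval: one key value may match many records.
matches = sorted(name_index["Amit"])
print(matches)
```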
(D) Define the following terms:
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation

QUE 3- EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 - r2
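QBE itself is form-based, so the answers are filled-in skeleton tables; but the three queries are simply the set operations union, intersection and difference. As a quick non-QBE illustration, relations on schema R(A, B, C) can be modelled as Python sets of tuples (the sample tuples are invented):

```python
# Two relations on schema R(A, B, C), represented as sets of tuples.
r1 = {(1, "x", 10), (2, "y", 20)}
r2 = {(2, "y", 20), (3, "z", 30)}

union        = r1 | r2   # r1 ∪ r2: tuples in either relation
intersection = r1 & r2   # r1 ∩ r2: tuples in both
difference   = r1 - r2   # r1 - r2: tuples of r1 not in r2

print(union, intersection, difference)
```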
QUE4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows: if a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot have a lossless decomposition into any number of smaller tables.
Another way of expressing this is that each join dependency is a consequence of the candidate keys. It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes. Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
- You always need to know two values (pairwise).
- For any one value, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:

buyer | vendor        | item
Sally | Liz Claiborne | Blouses
Mary  | Liz Claiborne | Blouses
Sally | Jordach       | Jeans
Mary  | Jordach       | Jeans
Sally | Jordach       | Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and the vendor; to determine the vendor you must know the buyer and the item; and to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
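The decomposition above can be checked mechanically: project Buying onto the three pairs of attributes, then natural-join the projections. For this sample data the three-way join recovers exactly the original relation, which is what the join dependency asserts.

```python
# The Buying relation from the sample data above, as a set of tuples.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three binary projections of the 5NF decomposition.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural join of all three projections on their common attributes.
rejoined = {
    (b, v, i)
    for (b, v) in buyer_vendor
    for (b2, i) in buyer_item if b2 == b
    for (v2, i2) in vendor_item if v2 == v and i2 == i
}
print(rejoined == buying)  # the join is lossless: no spurious tuples
```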
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product designed to support both batch and online application programs.
Fig: IMS system architecture. Each application program (Application A, Application B) is written in a host language plus DL/I. Each application has its own PSB (PSB-A, PSB-B) containing PCBs, and the IMS control program maps these PCBs onto the DBDs that define the physical databases.
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD), which also gives the mapping of the physical database to storage. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled and the object form is stored in a system library, from which it may be extracted when required by the IMS control program.
All DBD names in IMS are limited to a maximum length of eight characters.
Example:
1  DBD    NAME=EDUCPDBD
2  SEGM   NAME=COURSE,BYTES=256
3  FIELD  NAME=(COURSE#,SEQ),BYTES=3,START=1
4  FIELD  NAME=TITLE,BYTES=33,START=4
5  FIELD  NAME=DESCRIPN,BYTES=220,START=37
6  SEGM   NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD  NAME=(COURSE#,SEQ),BYTES=3,START=1
8  FIELD  NAME=TITLE,BYTES=33,START=4
9  SEGM   NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD  NAME=(DATE#,SEQ,M),BYTES=6,START=1
11 FIELD  NAME=LOCATION,BYTES=12,START=7
12 FIELD  NAME=FORMAT,BYTES=2,START=19
13 SEGM   NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD  NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD  NAME=NAME,BYTES=18,START=7
16 SEGM   NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD  NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD  NAME=NAME,BYTES=18,START=7
19 FIELD  NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called the program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the logical database and the corresponding physical database.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example:
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are supported via user-written online application programs; IMS does not provide an integrated query language.
OR
(C) Explain the following:
(i) Functional dependency
Functional dependency: the value of one attribute (the determinant) determines the value of another attribute.
Candidate key: a possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
The main characteristics of the functional dependencies used in normalization are that they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency, they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
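The definition above is directly checkable: X -> Y holds in a relation instance exactly when no two rows agree on X but disagree on Y. A small sketch (the emp table and its values are invented for illustration):

```python
def fd_holds(rows, determinant, dependent):
    """Check X -> Y on a relation instance.

    rows: list of dicts (tuples of the relation)
    determinant, dependent: tuples of attribute names (X and Y)
    """
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in determinant)
        y = tuple(row[a] for a in dependent)
        # Same determinant value must always map to the same dependent value.
        if seen.setdefault(x, y) != y:
            return False
    return True

emp = [
    {"emp_id": 1, "dept": "Sales", "city": "Pune"},
    {"emp_id": 2, "dept": "Sales", "city": "Pune"},
    {"emp_id": 3, "dept": "HR",    "city": "Nagpur"},
]
ok  = fd_holds(emp, ("emp_id",), ("dept",))   # emp_id -> dept holds
bad = fd_holds(emp, ("city",),  ("emp_id",))  # city -> emp_id fails: Pune maps to 1 and 2
print(ok, bad)
```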
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal-form condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties. Industry pays particular attention to normalization up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
- NF2: non-first normal form
- 1NF: R is in 1NF iff all domain values are atomic
- 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
- 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
- BCNF: R is in BCNF iff every determinant is a candidate key
- Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and its multivalued dependencies are functional dependencies. 4NF thus removes the unwanted data structures caused by multivalued dependencies.
For a relation to be in fourth normal form, it must be in BCNF and one of these conditions must hold:
- There is no multivalued dependency in the relation, or
- There are multivalued dependencies, but the attributes involved are dependent between themselves.
Fourth normal form differs from BCNF only in that it uses multivalued dependencies.
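A classic illustration (the Course relation here is a made-up example, not from the question paper): if a course's teachers and books are independent multivalued facts, the unnormalized table must store every teacher/book combination. Splitting it into one relation per independent fact gives 4NF, and joining them back loses nothing.

```python
# course ->> teacher and course ->> book are independent multivalued
# dependencies, so all combinations appear in the unnormalized relation.
course = {
    ("DBMS", "Rao",  "Date"),
    ("DBMS", "Rao",  "Navathe"),
    ("DBMS", "Shah", "Date"),
    ("DBMS", "Shah", "Navathe"),
}

# 4NF decomposition: one relation per independent multivalued fact.
course_teacher = {(c, t) for c, t, b in course}
course_book    = {(c, b) for c, t, b in course}

# Natural join of the two projections reproduces the original relation.
rejoined = {(c, t, b) for c, t in course_teacher
                      for c2, b in course_book if c2 == c}
print(rejoined == course)  # lossless: the decomposition loses nothing
```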
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as they are in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could get a user's account information and efficiently provide extensive information such as transactions and account entries.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems, which is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery is easier to achieve if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls the following:
- the speed and size of your transaction log backups;
- the degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery model available:
- Full Recovery
- Bulk-Logged Recovery
- Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. The SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
- Database restoration up to any specified point in time can be achieved after media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
- The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
- CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index creations is faster, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(D) Describe deadlocks in a distributed system.
Ans:
(D) Explain knowledge-based database systems in detail.
Ans: The term knowledge base was coined to distinguish this form of knowledge store from the more common and widely used term database. At the time (the 1970s), virtually all large management information systems stored their data in some type of hierarchical or relational database. At this point in the history of information technology, the distinction between a database and a knowledge base was clear and unambiguous.
A database had the following properties:
- Flat data: data was usually represented in a tabular format, with strings or numbers in each field.
- Multiple users: a conventional database needed to support more than one user or system logged into the same data at the same time.
- Transactions: an essential requirement for a database was to maintain integrity and consistency among data accessed by concurrent users. These are the so-called ACID properties: Atomicity, Consistency, Isolation and Durability.
- Large, long-lived data: a corporate database needed to support not just thousands but hundreds of thousands or more rows of data. Such a database usually needed to persist past the specific uses of any individual program; it needed to store data for years and decades rather than for the life of a program.
The first knowledge-based systems had data needs that were the opposite of these database requirements. An expert system requires structured data: not just tables with numbers and strings, but pointers to other objects that in turn have additional pointers. The ideal representation for a knowledge base is an object model (often called an ontology in artificial intelligence literature) with classes, subclasses and instances.
Early expert systems also had little need for multiple users or the complexity that comes with requiring transactional properties on data. The data for the early expert systems was used to arrive at a specific answer, such as a medical diagnosis, the design of a molecule, or a response to an emergency.[1] Once the solution to the problem was known, there was not a critical demand to store large amounts of data back to a permanent memory store. A more precise statement would be that, given the technologies available, researchers compromised and did without these capabilities because they realized they were beyond what could be expected, and they could develop useful solutions to non-trivial problems without them. Even from the beginning, the more astute researchers realized the potential benefits of being able to store, analyze and reuse knowledge; see, for example, the discussion of Corporate Memory in the earliest work of the Knowledge-Based Software Assistant program by Cordell Green et al.[2]
The volume requirements were also different for a knowledge base compared to a conventional database. The knowledge base needed to know facts about the world, for example to represent the statement "All humans are mortal". A database typically could not represent this general knowledge, but would instead need to store information about thousands of specific humans. Representing that all humans are mortal, and being able to reason that any given human is therefore mortal, is the work of a knowledge base. Representing that George, Mary, Sam, Jenna, Mike and hundreds of thousands of other customers are all humans with specific ages, sex, address, etc. is the work of a database.[3][4]
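The distinction can be made concrete with a toy sketch (not any particular KB system; all names are invented): the "database" part stores specific facts, while the "knowledge base" part adds a general rule that lets the system derive facts it never stored.

```python
# Database-style storage: specific, enumerated facts.
facts = {("human", "George"), ("human", "Mary"), ("human", "Sam")}

# Knowledge-base-style rule: "all humans are mortal". The conclusion
# is derived at query time, never stored as a row.
def is_mortal(name):
    return ("mortal", name) in facts or ("human", name) in facts

print(is_mortal("Mary"))  # derived from the rule, not looked up
```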
As expert systems moved from being prototypes to systems deployed in corporate environments, the requirements for their data storage rapidly started to overlap with the standard database requirements for multiple, distributed users with support for transactions. Initially, the demand could be seen in two different but competitive markets. From the AI and object-oriented communities, object-oriented databases such as Versant emerged: systems designed from the ground up to support object-oriented capabilities, but also to support standard database services. On the other hand, the large database vendors, such as Oracle, added capabilities to their products that provided support for knowledge-base requirements such as class-subclass relations and rules.
Internet as a knowledge base
The next evolution for the term knowledge base was the Internet. With the rise of the Internet, documents, hypertext and multimedia support became critical for any corporate database. It was no longer enough to support large tables of data or relatively small objects that lived primarily in computer memory. Support for corporate web sites required persistence and transactions for documents. This created a whole new discipline known as Web Content Management. The other driver for document support was the rise of knowledge-management vendors such as Lotus Notes. Knowledge management actually predated the Internet, but with the Internet there was great synergy between the two areas. Knowledge-management products adopted the term knowledge base to describe their repositories, but the meaning had a subtle difference. In the case of previous knowledge-based systems, the knowledge was primarily for the use of an automated system, to reason about and draw conclusions about the world. With knowledge-management products, the knowledge was primarily meant for humans, for example to serve as a repository of manuals, procedures, policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a system that could really be cleanly classified as knowledge-based in the sense of an expert system that performed automated reasoning, or knowledge-based in the sense of knowledge management that provided knowledge in the form of documents and media to be leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018–2019
Subject: DBMS
MCA 1st year (Sem. II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another. Concurrent access is quite easy if all users are just reading data, since there is no way they can interfere with one another. Any practical database, however, has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-user system. It helps you make sure that database transactions are performed concurrently without violating the data integrity of the respective databases.
Concurrency control is therefore a most important element for the proper functioning of a system where two or more database transactions that require access to the same data are executed simultaneously.
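The lost-update conflict described above can be sketched with ordinary threads: several "transactions" perform a read-modify-write on a shared balance, and a lock serialises the critical section so no update is lost. This illustrates the idea behind lock-based concurrency control, not a real DBMS implementation.

```python
import threading

balance = 0
lock = threading.Lock()

def deposit(times):
    """Each call plays the role of a transaction making many small updates."""
    global balance
    for _ in range(times):
        with lock:          # critical section: read, modify, write as a unit
            balance += 1

threads = [threading.Thread(target=deposit, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)  # with the lock, all 40000 updates survive
```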
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, "undividable") is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client: at one moment in time it has not yet happened, and at the next it has already occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B. Performing these operations in an atomic transaction ensures that the database remains in a consistent state, that is, money is neither lost nor created if either of the two operations fails.
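The transfer example can be demonstrated with Python's built-in sqlite3. The CHECK constraint below is artificial, added only to force the second operation to fail; the point is that the rollback also undoes the first operation, so the transfer is all-or-nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Artificial upper bound on balances, just to make the credit to B fail.
conn.execute("CREATE TABLE account ("
             "name TEXT PRIMARY KEY, "
             "balance INTEGER CHECK (balance BETWEEN 0 AND 100))")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")
conn.commit()

try:
    # Transfer 80 from A to B: the debit succeeds (A drops to 20)...
    conn.execute("UPDATE account SET balance = balance - 80 WHERE name = 'A'")
    # ...but the credit would push B to 130, violating the CHECK.
    conn.execute("UPDATE account SET balance = balance + 80 WHERE name = 'B'")
    conn.commit()
except sqlite3.IntegrityError:
    conn.rollback()   # atomicity: the successful debit of A is undone too

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # unchanged: money was neither lost nor created
```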
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for a DBMS:
- All users should be able to access the same data.
- A user's view is immune to changes made in other views.
- Users should not need to know physical database storage details.
- The DBA should be able to change database storage structures without affecting the users' views.
- The internal structure of the database should be unaffected by changes to physical aspects of storage.
- The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one closest to the user; it is also called the user view. The user view is different from the way data is stored in the database, and describes only a part of the actual database. Because each user is not concerned with the entire database, only the part that is relevant to the user is visible. For example, end users and application programmers get different external views.
Each user uses a language to carry out database operations. The application programmer uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a combination of three subordinate languages:
- Data Definition Language (DDL)
- Data Manipulation Language (DML)
- Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data manipulation language performs operations on these objects. The data control language is used to control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The conceptual level represents the entire database as a whole and is used by the DBA. This level is the view of the data "as it really is". The user's view of the data is constrained by the language being used; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for a DBMS are explained above.
(C) Describe the structure of DBMS.
Ans: A DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations (insert, delete, update and retrieval) on the database. The components of the DBMS perform these requested operations on the database and provide the necessary data to the users.
Fig: Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified in the DDL. It includes metadata information such as the names of the files, the data items, the storage details of each file, mapping information, constraints, etc.
2. DML Compiler and Query Optimizer - The DML commands, such as insert, update, delete and retrieve, from the application program are sent to the DML compiler for compilation into object code for database access. The object code is then optimized by the query optimizer to find the best way to execute the query, and then sent to the data manager.
3. Data Manager - The data manager is the central software component of the DBMS, also known as the database control system.
The main functions of the data manager are:
- Converting operations in users' queries, coming from the application programs or from the combination of DML compiler and query optimizer (known as the query processor), from the user's logical view to the physical file system.
- Controlling access to DBMS information that is stored on disk.
- Handling buffers in main memory.
- Enforcing constraints to maintain the consistency and integrity of the data.
- Synchronizing the simultaneous operations performed by concurrent users.
- Controlling the backup and recovery operations.
4. Data Dictionary - The data dictionary is a repository of descriptions of the data in the database. It contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures, access paths, and file and record sizes.
5. Access authorization - a description of database users, their responsibilities and their access rights.
6. Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control data integrity, database operation and accuracy, and may be used as an important part of the DBMS.
Importance of the Data Dictionary - the data dictionary is necessary in databases for the following reasons:
- It improves the DBA's control of the information system and the users' understanding of the use of the system.
- It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts high-level queries into low-level file access commands, known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups, depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve users: Naive users need not be aware of the presence of the database system or any other system; a user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction, and responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database - in the case of the automatic teller machine, only one or more of the user's own accounts. Other such naive users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. They are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing the application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users most familiar with the database and are responsible for creating, modifying, and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view, or conceptual level, of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and the access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It also implies separation of physical storage from the use of the data by an application program, i.e. program/data independence: the user, programmer, or application specialist need not know the details of how the data are stored; such details are transparent to the user. Changes can be made to the data without affecting other components of the system, e.g. changing the format of data items (real to integer arithmetic), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updating of the same data in different files
• Time wasted entering the same data again and again
• Computer resources being needlessly used
• Great difficulty in combining information
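The duplication problem above can be illustrated with a small sketch (the customer data and file names are hypothetical): two applications keep their own copy of the same record, and an update applied to only one copy leaves the data inconsistent.

```python
# Sketch: the same customer record duplicated in two application "files".
sales_file = {"C001": {"name": "Ravi", "address": "12 MG Road"}}
billing_file = {"C001": {"name": "Ravi", "address": "12 MG Road"}}

# The address changes, but only the sales application is updated.
sales_file["C001"]["address"] = "45 Nehru Street"

def inconsistent(key):
    """Report whether the two files disagree for a given customer."""
    return sales_file[key] != billing_file[key]

print(inconsistent("C001"))  # True: the two copies have diverged
```

A centralized database stores the address once, so there is only one copy to update.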
2. Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data. We therefore need to remove this duplication of data across multiple files to eliminate inconsistency.
3. Better Service to the Users - A DBMS is often used to provide better service to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness are likely to improve, since the data can now be shared, and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, use of a DBMS should allow users who do not know programming to interact with the data more easily, unlike a file processing system, where a programmer may need to write new programs to meet every new demand.
4. Flexibility of the System is Improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5. Integrity Can Be Improved - Since the data of an organization using the database approach is centralized and used by a number of users at a time, it is essential to enforce integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards Can Be Enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.
7. Security Can Be Improved - In conventional systems, applications are developed in an ad hoc, temporary manner. Often different systems of an organization access different components of the operational data; in such an environment, enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8. The Organization's Requirements Can Be Identified - All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. It may become necessary to ignore some requests for information if they conflict with higher-priority needs of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization.
9. Overall Cost of Developing and Maintaining Systems is Lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for similar services using conventional systems, since the productivity of programmers can be higher using the non-procedural languages that have been developed with DBMSs than using procedural languages.
10. A Data Model Must Be Developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed as the needs of particular applications demand; the overall view is often not considered. Building an overall view of an organization's data is usually cost-effective in the long term.
11. Provides Backup and Recovery - Centralizing a database makes it possible to provide schemes for backup and recovery from failures, including disk crashes, power failures, and software errors, which may help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods involved are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The E-R model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Modelling is an iterative, team-oriented process, with all business managers (or their designates) involved, and the result should be validated with a "bottom-up" approach. The model has three primary components: entities, relationships, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type, and there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name, and address.
Attributes are of various types:
• Simple/single attributes
• Composite attributes
• Multivalued attributes
• Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds, connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One-to-many: 1 ------- M
Many-to-one: M ------- 1
Many-to-many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer, with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, and address (city, state, zip_code, street), where street is itself composite: (street_name, street_number, apartment_number).
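As an illustrative sketch (not part of the original answer), the Customer entity above can be mapped to a relational table by flattening the composite name and address attributes into simple columns; sqlite3 is used here only as a convenient stand-in for any relational DBMS, and the sample row is invented.

```python
import sqlite3

# Sketch: the Customer entity flattened into one relational table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,
        first_name       TEXT,
        middle_name      TEXT,
        last_name        TEXT,
        phone_number     TEXT,
        date_of_birth    TEXT,
        city             TEXT,
        state            TEXT,
        zip_code         TEXT,
        street_name      TEXT,
        street_number    TEXT,
        apartment_number TEXT
    )
""")
conn.execute(
    "INSERT INTO customer (customer_id, first_name, last_name, city) "
    "VALUES (1, 'Asha', 'Rao', 'Nagpur')"
)
row = conn.execute(
    "SELECT first_name, last_name FROM customer WHERE customer_id = 1"
).fetchone()
print(row)  # ('Asha', 'Rao')
```

A multivalued attribute such as phone_number would normally be moved to its own table keyed by customer_id, since one column can hold only one value per row.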
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential file, index-sequential file, and direct file organizations, we have considered the retrieval and update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file on the attribute "stud_name", we get the set of records which satisfy the given value.
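A minimal sketch of this, assuming a hypothetical student table: a lookup on the non-primary attribute stud_name returns a set of matching records rather than at most one, and a secondary index can speed up such lookups.

```python
import sqlite3

# Sketch: retrieval on a secondary key ("stud_name") can match many records,
# unlike a primary-key lookup, which matches at most one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, stud_name TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(1, "Amit"), (2, "Priya"), (3, "Amit")],
)
# A secondary index on the non-primary attribute speeds up these lookups.
conn.execute("CREATE INDEX idx_stud_name ON student (stud_name)")

rows = conn.execute(
    "SELECT roll_no FROM student WHERE stud_name = 'Amit' ORDER BY roll_no"
).fetchall()
print(rows)  # [(1,), (3,)] -- a set of records, not a single one
```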
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation

QUE3- EITHER
(A) Let R = (A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and cannot be losslessly decomposed any further into smaller tables. Another way of expressing this is that each join dependency is a consequence of the candidate keys. It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields. 5NF is based on the concept of join dependency: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
• You always need to know two values (pairwise).
• For any one value, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy. Take the following sample data:

buyer | vendor        | item
------+---------------+---------
Sally | Liz Claiborne | Blouses
Mary  | Liz Claiborne | Blouses
Sally | Jordach       | Jeans
Mary  | Jordach       | Jeans
Sally | Jordach       | Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, to determine the item you must know the buyer and the vendor; to determine the vendor you must know the buyer and the item; and to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item, and Vendor-Item.
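The decomposition suggested above can be checked mechanically. The sketch below, using plain Python sets, projects the sample Buying data onto the three binary tables and joins them back; the join reproduces exactly the original five rows, which is the join dependency at work.

```python
from itertools import product

# Sketch: Buying(buyer, vendor, item) decomposed into its three binary
# projections joins back losslessly -- the join dependency.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach",       "Jeans"),
    ("Mary",  "Jordach",       "Jeans"),
    ("Sally", "Jordach",       "Sneakers"),
}
buyer_vendor = {(b, v) for b, v, _ in buying}   # Buyer-Vendor projection
buyer_item   = {(b, i) for b, _, i in buying}   # Buyer-Item projection
vendor_item  = {(v, i) for _, v, i in buying}   # Vendor-Item projection

# Natural join of all three projections.
rejoined = {
    (b, v, i)
    for (b, v), (v2, i) in product(buyer_vendor, vendor_item)
    if v == v2 and (b, i) in buyer_item
}
print(rejoined == buying)  # True: no spurious tuples, the decomposition is lossless
```

Note that joining only two of the projections would produce a spurious (Mary, Jordach, Sneakers) tuple; all three are needed, which is exactly why this is a join dependency rather than a pair of multivalued dependencies.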
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product designed to support both batch and online application programs.
[Figure: IMS system architecture. Each application program (host language + DL/I calls) communicates with the IMS control program through the PCBs of its PSB (PSB-A for Application A, PSB-B for Application B); the control program accesses the physical databases, each defined by a DBD.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD). The mapping of the physical database to storage is also specified in the DBD. The set of DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example:
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called a program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example:
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are supported via user-written online application programs. IMS does not provide an integrated query language.
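As a rough sketch (the segment values are invented, and Python dictionaries stand in for IMS's stored hierarchy), the COURSE/OFFERING/STUDENT hierarchy defined by the DBD above can be pictured as nested records traversed top-down, much as a DL/I program would navigate it.

```python
# Sketch (hypothetical data): the EDUCPDBD hierarchy -- COURSE at the root,
# PREREQ and OFFERING as its children, TEACHER and STUDENT under OFFERING --
# modelled as nested dictionaries.
course = {
    "COURSE": "M23", "TITLE": "Dynamics of Databases",
    "PREREQ": [{"COURSE": "M19", "TITLE": "Fundamentals"}],
    "OFFERING": [
        {
            "DATE": "730813", "LOCATION": "Nagpur",
            "TEACHER": [{"EMP": "421633", "NAME": "S. Deshpande"}],
            "STUDENT": [{"EMP": "183009", "NAME": "A. Kulkarni", "GRADE": "A"}],
        }
    ],
}

def students_of(course_seg):
    """Walk COURSE -> OFFERING -> STUDENT, the hierarchical access path."""
    return [s["NAME"] for off in course_seg["OFFERING"] for s in off["STUDENT"]]

print(students_of(course))  # ['A. Kulkarni']
```

The point of the sketch is that every access path starts at the root segment, which is why IMS retrieval is navigational rather than query-based.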
OR
(C) Explain the following:
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
The main characteristics of the functional dependencies used in normalization are that they:
• have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency;
• hold for all time;
• are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
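The definition above can be made concrete with a small checker (the staff data is hypothetical): a functional dependency X → Y holds in a relation exactly when no two tuples agree on X but disagree on Y.

```python
# Sketch: checking whether a functional dependency X -> Y holds in a relation
# represented as a list of dictionaries (one dict per tuple).
def fd_holds(relation, lhs, rhs):
    """True iff tuples agreeing on all lhs attributes agree on all rhs attributes."""
    seen = {}
    for row in relation:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False  # same determinant, different dependent value
        seen[key] = val
    return True

staff = [
    {"staff_no": "S1", "branch": "B1", "city": "Pune"},
    {"staff_no": "S2", "branch": "B1", "city": "Pune"},
    {"staff_no": "S3", "branch": "B2", "city": "Mumbai"},
]
print(fd_holds(staff, ["branch"], ["city"]))    # True: branch determines city
print(fd_holds(staff, ["city"], ["staff_no"]))  # False: Pune maps to both S1 and S2
```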
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal-form condition that it meets, and indicates the degree to which it has been normalized. Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to forms up to 3NF, BCNF, or 4NF; here we pay particular attention to forms up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
• NF2: non-first normal form
• 1NF: R is in 1NF iff all domain values are atomic
• 2NF: R is in 2NF iff R is in 1NF and every non-key attribute is fully dependent on the key
• 3NF: R is in 3NF iff R is in 2NF and every non-key attribute is non-transitively dependent on the key
• BCNF: R is in BCNF iff every determinant is a candidate key
• Determinant: an attribute on which some other attribute is fully functionally dependent

Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and its multivalued dependencies are functional dependencies. 4NF removes unwanted data structures: multivalued dependencies.
One of the following conditions must hold for a relation to be in fourth normal form:
• There is no multivalued dependency in the relation, or
• There are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers multivalued dependencies.
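A sketch of the 4NF situation, with invented data: an employee determines a set of skills and, independently, a set of languages (two multivalued dependencies). The relation is in BCNF but not 4NF, and decomposing it into two binary tables loses nothing, which is why 4NF calls for the split.

```python
from itertools import product

# Sketch (hypothetical data): emp ->> skill and emp ->> language are
# multivalued dependencies; skills and languages vary independently, so the
# single table stores every combination redundantly.
emp_skill_lang = {
    ("Riya", "SQL",    "English"),
    ("Riya", "SQL",    "Hindi"),
    ("Riya", "Python", "English"),
    ("Riya", "Python", "Hindi"),
}
emp_skill = {(e, s) for e, s, _ in emp_skill_lang}  # 4NF table 1
emp_lang  = {(e, l) for e, _, l in emp_skill_lang}  # 4NF table 2

# Because of the multivalued dependencies, joining the two projections
# reproduces the original relation exactly.
rejoined = {
    (e, s, l)
    for (e, s), (e2, l) in product(emp_skill, emp_lang)
    if e == e2
}
print(rejoined == emp_skill_lang)  # True: the 4NF decomposition is lossless
```

Adding a new language for Riya requires only one row in emp_lang, instead of one row per skill in the undecomposed table.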
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could get a user's account information and efficiently provide extensive information such as transactions and account entries.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery is more easily achieved if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls the following:
• The speed and size of your transaction log backups
• The degree to which you might be at risk of losing committed transactions in the event of media failure
Models
There are three types of database recovery model available:
• Full Recovery
• Bulk-Logged Recovery
• Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
• Database restoration up to any specified point in time can be achieved after media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
• The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
• CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index creations proceeds faster because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(D) Describe deadlocks in a distributed system.
Ans:
Representing that George, Mary, Sam, Jenna, Mike, and hundreds of thousands of other customers are all humans with specific ages, sex, address, etc. is the work for a database.[3][4]
As expert systems moved from being prototypes to systems deployed in corporate environments, the requirements for their data storage rapidly started to overlap with the standard database requirements for multiple, distributed users with support for transactions. Initially, the demand could be seen in two different but competitive markets. From the AI and object-oriented communities, object-oriented databases such as Versant emerged. These were systems designed from the ground up to support object-oriented capabilities, but also to support standard database services as well. On the other hand, the large database vendors, such as Oracle, added capabilities to their products that provided support for knowledge-base requirements such as class-subclass relations and rules.
Internet as a knowledge base
The next evolution of the term knowledge base was driven by the Internet. With the rise of the Internet, documents, hypertext, and multimedia support became critical for any corporate database. It was no longer enough to support large tables of data or relatively small objects that lived primarily in computer memory. Support for corporate web sites required persistence and transactions for documents. This created a whole new discipline known as Web Content Management. The other driver for document support was the rise of knowledge management vendors such as Lotus Notes. Knowledge management actually predated the Internet, but with the Internet there was great synergy between the two areas. Knowledge management products adopted the term knowledge base to describe their repositories, but the meaning had a subtle difference. In the case of previous knowledge-based systems, the knowledge was primarily for the use of an automated system, to reason about and draw conclusions about the world. With knowledge management products, the knowledge was primarily meant for humans, for example to serve as a repository of manuals, procedures, policies, best practices, reusable designs and code, etc. In both cases, the distinctions between the uses and kinds of systems were ill-defined. As the technology scaled up, it was rare to find a system that could really be cleanly classified as knowledge-based in the sense of an expert system that performed automated reasoning, or knowledge-based in the sense of knowledge management that provided knowledge in the form of documents and media that could be leveraged by humans.
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question Paper Solution
Summer-17
Academic Session 2018 – 2019
Subject: DBMS
MCA 1st Year (Sem II)
QUE 1-
(A) Explain the following in detail:
(i) Concurrency control
Ans: Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another. Concurrent access is quite easy if all users are just reading data, since there is no way they can interfere with one another. However, any practical database has a mix of READ and WRITE operations, and hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in multi-user systems. It helps you make sure that database transactions are performed concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is an essential element for the proper functioning of a system where two or more database transactions that require access to the same data are executed simultaneously.
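A minimal sketch of the problem concurrency control solves, using a simple lock in place of a DBMS's locking protocol (the amounts and thread counts are arbitrary): without the lock, interleaved read-modify-write sequences could lose updates; with it, every deposit is applied.

```python
import threading

# Sketch: the "lost update" conflict avoided by serializing access with a lock,
# the simplest form of the concurrency control a DBMS performs.
balance = 0
lock = threading.Lock()

def deposit(amount, times):
    global balance
    for _ in range(times):
        with lock:  # serialize the read-modify-write sequence
            current = balance
            balance = current + amount

threads = [threading.Thread(target=deposit, args=(1, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(balance)  # 40000: no updates were lost
```

Removing the `with lock:` line allows two threads to read the same `current` value and overwrite each other's deposit, which is precisely the interference described above.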
(ii) Atomicity property
In database systems, atomicity (/ˌætəˈmɪsəti/; from Ancient Greek ἄτομος, átomos, "undividable") is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client: at one moment in time it has not yet happened, and at the next it has already occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B. Performing these operations in an atomic transaction ensures that the database remains in a consistent state, that is, money is neither lost nor created if either of those two operations fails.
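The transfer example can be sketched with sqlite3, whose connection context manager gives exactly this all-or-nothing behaviour (the account names and amounts are invented):

```python
import sqlite3

# Sketch: both transfer operations commit together or neither does; a failure
# mid-transfer rolls the database back to its prior consistent state.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

try:
    with conn:  # the with-block is one atomic transaction
        conn.execute("UPDATE account SET balance = balance - 70 WHERE name = 'A'")
        # Simulate a crash between the withdrawal and the deposit to B.
        raise RuntimeError("failure before the deposit to B")
except RuntimeError:
    pass  # the withdrawal from A was rolled back automatically

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'A': 100, 'B': 50} -- unchanged, as if nothing happened
```

Had the with-block completed, both updates would have been committed together; the partial state (A debited, B not yet credited) is never visible after the transaction ends.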
(B) Give the three-level architecture proposal for DBMS.
Ans: The objectives of the three-level architecture proposal for a DBMS are:
• All users should be able to access the same data.
• A user's view is immune to changes made in other views.
• Users should not need to know physical database storage details.
• The DBA should be able to change database storage structures without affecting the users' views.
• The internal structure of the database should be unaffected by changes to the physical aspects of storage.
• The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a. External level
b. Conceptual level
c. Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user view is different from the way data is stored in the database; this view describes only a part of the actual database. Because each user is not concerned with the entire database, only the part that is relevant to that user is visible. For example, end users and application programmers get different external views.
Each user uses a language to carry out database operations. The application programmer uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data manipulation language performs operations on these objects. The data control language is used to control the user's access to database objects.
Conceptual Level - This level comes between the external and the internal levels. The conceptual level represents the entire database as a whole and is used by the DBA. This level is the view of the data "as it really is". The user's view of the data is constrained by the language being used; at the conceptual level, the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of the architecture. The internal level describes the physical sequence of the stored records.
Thus the objectives of the three-level architecture proposal for DBMS are suitably explained above.
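One common way the external level is realized in SQL systems is through views. The sketch below (a hypothetical employee table, with sqlite3 as a stand-in for any relational DBMS) gives one user group an external view that hides the salary column while the conceptual-level table remains unchanged.

```python
import sqlite3

# Sketch: a base table (conceptual level) and a SQL view serving as one user
# group's restricted external view of the same data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Meera", 50000), (2, "Rahul", 60000)],
)

# External view for users who must not see salaries.
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")

rows = conn.execute("SELECT * FROM employee_public ORDER BY id").fetchall()
print(rows)  # [(1, 'Meera'), (2, 'Rahul')] -- the salary column is hidden
```

Changing the base table's storage or adding columns does not disturb this view, which is the data-independence objective listed above.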
(C) Describe the structure of DBMS.
Ans: A DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations (insert, delete, update, and retrieve) on the database. The components of the DBMS perform these requested operations on the database and provide the necessary data to the users.
Fig: Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - The Data Description Language compiler processes schema definitions specified
in the DDL. It includes metadata information such as the names of the files, the data items, storage
details of each file, mapping information, and constraints.
2 DML Compiler and Query optimizer - The DML commands such as insert update delete
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized in the best way to execute a query by
the query optimizer and then sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are -
It converts operations in users' queries, coming from the application programs or from the
DML compiler and query optimizer (together known as the Query Processor), from the user's
logical view to the physical file system.
It controls access to the DBMS information that is stored on disk.
It also handles buffers in main memory.
It also enforces constraints to maintain the consistency and integrity of the data.
It also synchronizes the simultaneous operations performed by concurrent users.
It also controls the backup and recovery operations.
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1 Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2 Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3 Constraints on data, i.e. the range of values permitted.
4 Detailed information on physical database design, such as storage structure,
access paths, and file and record sizes.
5 Access authorization - descriptions of database users, their responsibilities,
and their access rights.
6 Usage statistics, such as frequency of queries and transactions.
The data dictionary is used to actually control data integrity, database operation,
and accuracy. It may be used as an important part of the DBMS.
Importance of Data Dictionary -
The data dictionary is necessary in databases for the following reasons:
It improves the control of the DBA over the information system and the users'
understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the result of every design phase and the design decisions.
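Most DBMSs expose the data dictionary through a system catalogue. As a small sketch (using SQLite, whose sqlite_master table and PRAGMA table_info play the role of the dictionary; the course table is invented for the example), the table names and column descriptions above can be read back like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_no TEXT PRIMARY KEY, title TEXT)")

# Table names come from the system catalogue (SQLite's sqlite_master)
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Column descriptions (name, declared type) for each table
columns = {t: [(c[1], c[2]) for c in conn.execute(f"PRAGMA table_info({t})")]
           for t in tables}
print(columns)  # {'course': [('course_no', 'TEXT'), ('title', 'TEXT')]}
```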
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7 End Users - The users of the database system can be classified in the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1 Naïve users
2 Online users
3 Application programmers
4 Database administrator
i) Naïve Users - Naïve users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database - in the case of the ATM user, only one or more of his or her own accounts. Other such naïve users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online Users - These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naïve users requiring help, such as menus.
iii) Application Programmers - Professional programmers who are responsible for developing application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator Centralized control of the database is exerted by a person or group of persons under the supervision of a high level administrator This person or group is referred to as the database administrator (DBA) They are users who are the most familiar with the database and are responsible for creating modifying and maintaining its three levels
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system
Ans A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications and data sharing: the database
becomes a common resource for an agency. It implies separation of physical storage from the use of the
data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored; such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic operations), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one device to
another (e.g. from optical to magnetic storage, from tape to disk).
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to duplication of the same data in different files
• Time wasted in entering the same data again and again
• Computer resources needlessly used
• Difficulty in combining information
2 Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data. We therefore need to remove this duplication of
data across multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In a
conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
currency is likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, the use of a DBMS
should allow users who don't know programming to interact with the data more easily, unlike a
file processing system, where the programmer may need to write new programs to meet every
new demand.
4 Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the
data in the database changes.
5 Integrity can be improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through DBMS so
standards are easier to enforce Standards may relate to the naming of data format of data
structure of the data etc Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems
7 Security can be improved - In conventional systems, applications are developed in an
ad hoc/temporary manner. Often different systems of an organization would access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database. Different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8 Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work the most
important, and therefore its needs the most important. Once a database has been
set up with centralized control, it becomes necessary to identify the organization's requirements and
to balance the needs of the competing units. It may therefore become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for a similar service using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed with DBMSs than using procedural languages.
10 Data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems it is more likely that files will be designed as the needs of particular
applications demand, and the overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes for
recovery and backup from failures, including disk crashes, power failures and software errors,
which help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans The E-R model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process: all business managers (or their designates)
should be involved, and the model should be validated with a "bottom-up" approach. It has three primary components: entities,
relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified An entity is an abstraction from the complexities of some
domain When we speak of an entity we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object, such as a house or a car; an event, such as a house sale or a car service; or a concept, such as a customer transaction or order.
An entity-type is a category An entity strictly speaking is an instance of a given entity-type There are
usually many instances of an entity-type Because the term entity-type is somewhat cumbersome most
people tend to use the term entity as a synonym for this term
Attributes An attribute is a characteristic of an entity. A Student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types
Simple/Single attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship Relationship captures how two or more entities are related to one another Relationships can
be thought of as verbs linking two or more nouns Examples an owns relationship between a company and a computer a supervises relationship between an employee and a department a performs relationship
between an artist and a song a proved relationship between a mathematician and a theorem Relationships
are represented as diamonds connected by lines to each of the entities in the relationship Types of
relationships are as follows
One to many (1:M)
Many to one (M:1)
Many to many (M:N)
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number)
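The attribute types above can be sketched in code. The following Python dataclasses are only an illustration (the sample values are invented): Address models a composite attribute, the phone-number list a multivalued attribute, and the computed age a derived attribute.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Address:            # composite attribute: built from simple parts
    city: str
    state: str
    zip_code: str

@dataclass
class Customer:           # entity type; each instance is one entity
    customer_id: int      # primary key attribute (simple/single)
    first_name: str
    last_name: str
    phone_numbers: List[str] = field(default_factory=list)  # multivalued
    date_of_birth: date = date(2000, 1, 1)

    @property
    def age_years(self) -> int:   # derived attribute: computed, not stored
        return (date.today() - self.date_of_birth).days // 365

c = Customer(1, "Asha", "Rao", ["555-0101", "555-0102"])
print(c.customer_id, len(c.phone_numbers))  # 1 2
```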
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example
Ans In sequential files, index-sequential files and direct files, we have considered the retrieval and
update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we may get a set of
records which satisfy the given value.
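The student-file example can be sketched as a secondary index built in Python (records and names invented for illustration): each value of the non-unique attribute stud_name maps to the list of primary keys of the matching records.

```python
from collections import defaultdict

# A student file keyed on the primary key roll_no
students = [
    {"roll_no": 1, "stud_name": "Ravi", "branch": "MCA"},
    {"roll_no": 2, "stud_name": "Meena", "branch": "MCA"},
    {"roll_no": 3, "stud_name": "Ravi", "branch": "MBA"},
]

# Secondary index on stud_name: each key value maps to the set of
# matching records (here, their primary keys)
by_name = defaultdict(list)
for rec in students:
    by_name[rec["stud_name"]].append(rec["roll_no"])

print(by_name["Ravi"])  # [1, 3] -- multiple records satisfy one key value
```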
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3- EITHER
(A) Let R(A,B,C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries -
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows
1 If a table can be decomposed into three or more smaller tables it must be capable of being joined
again on common keys to form the original table
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot
have a non-trivial lossless decomposition into any number of smaller tables.
Another way of expressing this is that every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further then it is in 5NF
Pair wise cyclical dependency means that
You always need to know two values (pair wise)
For any one you must know the other two (cyclical)
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is what do you do if Claiborne starts to sell Jeans How many records must you create to
record this fact
The problem is there are pairwise cyclical dependencies in the primary key That is in order to determine
the item you must know the buyer and vendor and to determine the vendor you must know the buyer and
the item and finally to know the buyer you must know the vendor and the item The solution is to break
this one table into three tables Buyer-Vendor Buyer-Item and Vendor-Item
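A small Python sketch (using the sample data above) shows that the Buying table join-decomposes losslessly into its three binary projections, which is exactly the join dependency that 5NF is about. After the decomposition, "Claiborne starts to sell Jeans" needs only one new Vendor-Item tuple instead of one row per buyer.

```python
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# Project onto the three pairs: Buyer-Vendor, Buyer-Item, Vendor-Item
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural join of the three projections
rejoined = {(b, v, i)
            for b, v in buyer_vendor
            for b2, i in buyer_item if b2 == b
            for v2, i2 in vendor_item if v2 == v and i2 == i}

print(rejoined == buying)  # True: the decomposition is lossless
```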
(B) Explain the architecture of an IMS System
Ans Information Management system (IMS) is an IBM program product that is designed to support
both batch and online application programs
Fig IMS system structure: application programs A and B (host language + DL/I) access the IMS
control program through their PCBs, which are grouped into PSB-A and PSB-B; the physical
databases are defined by DBDs.
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined in the DBD. The set of DBDs corresponds to the conceptual schema plus the associated
conceptual/internal mapping definition.
DBD (Database Description) - Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1 DBD NAME=EDUCPDBD
2 SEGM NAME=COURSE,BYTES=256
3 FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4 FIELD NAME=TITLE,BYTES=33,START=4
5 FIELD NAME=DESCRIPN,BYTES=220,START=37
6 SEGM NAME=PREREQ,PARENT=COURSE,BYTES=36
7 FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8 FIELD NAME=TITLE,BYTES=33,START=4
9 SEGM NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
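The hierarchy this DBD defines (COURSE at the root; PREREQ and OFFERING as its children; TEACHER and STUDENT under OFFERING) can be sketched as nested records in Python. This is only an in-memory stand-in with made-up field values, not real IMS; the preorder walk mirrors the hierarchical-sequence order that a DL/I get-next traversal follows.

```python
# Hypothetical in-memory stand-in for one EDUCPDBD database record
course = {
    "segment": "COURSE", "fields": {"COURSE": "M16", "TITLE": "Databases"},
    "children": [
        {"segment": "PREREQ", "fields": {"COURSE": "M08"}, "children": []},
        {"segment": "OFFERING",
         "fields": {"DATE": "730813", "LOCATION": "Nagpur"},
         "children": [
             {"segment": "TEACHER", "fields": {"EMP": "421633"}, "children": []},
             {"segment": "STUDENT", "fields": {"EMP": "183009"}, "children": []},
         ]},
    ],
}

def get_next(seg):
    """Preorder walk: the hierarchical sequence a get-next call follows."""
    yield seg["segment"]
    for child in seg["children"]:
        yield from get_next(child)

order = list(get_next(course))
print(order)  # ['COURSE', 'PREREQ', 'OFFERING', 'TEACHER', 'STUDENT']
```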
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called the program specification block
(PSB).
PCB (Program Communication Block) - Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block) - The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT - The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency The value of one attribute (the determinant)
determines the value of another attribute
Candidate Key A possible key
Each non-key field is functionally dependent on every candidate key
No attribute in the key can be deleted without destroying the property of
unique identification
Main characteristics of functional dependencies used in
normalization:
They have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of
the dependency, hold for all time, and are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation,
and has the property that every functional dependency in Y is implied by the
functional dependencies in X.
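The definition "the determinant determines the value of another attribute" can be checked mechanically on a relation instance. A minimal sketch (relation and attribute names invented): a dependency lhs → rhs holds iff no two rows agree on lhs but differ on rhs.

```python
def holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False   # same determinant, two different dependent values
    return True

emp = [
    {"emp_id": 1, "dept": "Sales", "dept_city": "Pune"},
    {"emp_id": 2, "dept": "Sales", "dept_city": "Pune"},
    {"emp_id": 3, "dept": "HR",    "dept_city": "Nagpur"},
]
print(holds(emp, ["dept"], ["dept_city"]))    # True:  dept -> dept_city
print(holds(emp, ["dept_city"], ["emp_id"]))  # False: not a determinant
```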
(D) Explain 4 NF with examples
Ans Normalization - The process of decomposing unsatisfactory ("bad") relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal-form
condition that it meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties
Normalization in industry pays particular attention to normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
The database designers need not normalize to the highest possible normal form
Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes
Often executed as a series of steps Each step corresponds to a specific normal form which has
known properties
As normalization proceeds relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
BCNF: R is in BCNF iff every determinant is a candidate key
Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF
removes unwanted data structures: multivalued dependencies.
One of these conditions must hold for a relation to be in fourth normal form:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it
considers multivalued dependencies.
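A classic 4NF example can be sketched in Python (the employee, skills and languages are invented). When emp ->> skill and emp ->> language hold independently, a single flat table must pair every skill with every language; the 4NF decomposition stores each multivalued fact in its own table, and rejoining the projections is lossless.

```python
from itertools import product

# emp ->> skill and emp ->> language hold independently, so the flat
# table must contain every skill/language combination (redundancy)
flat = {("Ram", s, l)
        for s, l in product(["C", "Java"], ["Hindi", "English"])}

# 4NF decomposition: one table per multivalued fact
emp_skill = {(e, s) for e, s, l in flat}
emp_lang  = {(e, l) for e, s, l in flat}

# Lossless: joining the two projections recreates the flat table
rejoined = {(e, s, l)
            for e, s in emp_skill
            for e2, l in emp_lang if e2 == e}

print(len(flat), len(emp_skill) + len(emp_lang))  # 4 rows vs 2 + 2
```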
Q5
Either
(A) What are object oriented database systems What are its features
Ans Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
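The "faster access because joins are often not needed" point above can be sketched as follows (a toy illustration, not any particular object database's API): in the tabular style, finding an account's owner requires a key lookup across tables, whereas an object simply holds a direct reference that is followed as a pointer.

```python
# Relational style: two "tables" related through a foreign key
accounts = {101: {"owner_id": 7, "balance": 500}}
users = {7: {"name": "Asha"}}
owner_name = users[accounts[101]["owner_id"]]["name"]  # explicit join/lookup

# Object style: the account holds a direct reference to its owner
class User:
    def __init__(self, name):
        self.name = name

class Account:
    def __init__(self, owner, balance):
        self.owner = owner        # a pointer to the object, not a key value
        self.balance = balance

acct = Account(User("Asha"), 500)
print(owner_name, acct.owner.name)  # same answer; no key lookup in the second
```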
C) How is database recovery done? Discuss its different types.
Ans SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery can be more easily achieved if you are running at least SQL Server 2000.
It has a built-in feature known as the database recovery model that controls the following:
Both the speed and size of your transaction log backups
The degree to which you might be at risk of losing committed transactions in the event of
media failure
Models
There are three types of database recovery models available:
Full Recovery
Bulk Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery The SQL Server fully logs all operations so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log When data files are lost because of media failure the transaction log can be backed up
Database restoration up to any specified point in time can be achieved after a media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
The Log Marks feature allows you to place reference points in the transaction log so that you can
recover to a log mark.
It logs CREATE INDEX operations. Recovery from a transaction log backup that includes index
creations is faster because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system
Ans
Tulsiramji Gaikwad-Patil College of Engineering & Technology
Department of MCA
Question paper Solution
Summer-17
Academic Session 2018 – 2019
Subject DBMS
MCA-1st year (Sem II)
QUE 1-
(A) Explain the following in the detail
(i) Concurrency control
Ans Concurrency control is the procedure in a DBMS for managing simultaneous
operations without them conflicting with one another. Concurrent access is quite easy if all
users are just reading data, as there is no way they can interfere with one another. Any
practical database, though, will have a mix of read and write operations, and
hence concurrency is a challenge.
Concurrency control is used to address such conflicts, which mostly occur in a multi-
user system. It helps you to make sure that database transactions are performed
concurrently without violating the data integrity of the respective databases.
Therefore, concurrency control is a most important element for the proper functioning of a
system where two or more database transactions that require access to the same data
are executed simultaneously.
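The simplest concurrency-control mechanism, locking, can be sketched with Python threads (a toy stand-in for a DBMS lock manager, not a real one): four writers increment a shared balance, and the lock serialises each read-modify-write so no update is lost.

```python
import threading

balance = 0
lock = threading.Lock()

def deposit(times):
    global balance
    for _ in range(times):
        with lock:            # serialise the read-modify-write cycle
            balance += 1

threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)  # 40000 -- without the lock, lost updates could lower this
```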
(ii) Atomicity property
In database systems, atomicity (from Ancient Greek ἄτομος, átomos, "undividable") is one of
the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic
transaction is an indivisible and irreducible series of database operations such that either all occur or nothing occurs.[1] A guarantee of atomicity prevents updates to the database
occurring only partially, which can cause greater problems than rejecting the whole series
outright. As a consequence, the transaction cannot be observed to be in progress by another
database client: at one moment in time it has not yet happened, and at the next it has already
occurred in whole (or nothing happened, if the transaction was cancelled in progress).
An example of an atomic transaction is a monetary transfer from bank account A to account B. It consists of two operations: withdrawing the money from account A and saving it to account B.
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state, that is, money is neither lost nor created if either of those two operations fails.
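The transfer example can be demonstrated with sqlite3 (account names and amounts invented): the connection's context manager wraps both updates in one transaction, and a simulated crash between them rolls the withdrawal back, leaving the balances untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

try:
    with conn:  # one atomic transaction: both updates commit, or neither does
        conn.execute(
            "UPDATE account SET balance = balance - 70 WHERE name = 'A'")
        raise RuntimeError("crash before crediting B")  # simulated failure
        conn.execute(
            "UPDATE account SET balance = balance + 70 WHERE name = 'B'")
except RuntimeError:
    pass  # the context manager has already rolled the transaction back

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'A': 100, 'B': 50} -- the withdrawal was undone
```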
(B) Give the level architecture proposal for DBMS
Ans Objective of three level architecture proposal for DBMS
All users should be able to access same data
A users view is immune to changes made in other views
Users should not need to know physical database storage details
DBA should be able to change database storage structures without affecting the users views
Internal structure of database should be unaffected by changes to physical aspects of storage
DBA should be able to change conceptual structure of database without affecting all users
The architecture of a database management system can be broadly divided into three levels
a External level
b Conceptual level
c Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database. This view describes only a part of
the actual database Because each user is not concerned with the entire database only the part that
is relevant to the user is visible For example end users and application programmers get
different external views
Each user uses a language to carry out database operations The application programmer
uses either a conventional third-generation language such as COBOL or C or a fourth-generation
language specific to the DBMS such as visual FoxPro or MS Access
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares the database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
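The three sublanguages can be sketched as follows (an illustration using SQLite from Python; the `student` table is hypothetical, and since SQLite has no DCL, the GRANT statement is shown as text only):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL -- defines and declares a database object:
conn.execute("CREATE TABLE student (stud_id INTEGER PRIMARY KEY, name TEXT)")

# DML -- performs operations on that object:
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("UPDATE student SET name = 'Asha R' WHERE stud_id = 1")
rows = conn.execute("SELECT name FROM student").fetchall()

# DCL -- controls access; SQLite has no GRANT/REVOKE, so this statement is
# illustrative only (it would run on a server DBMS such as PostgreSQL):
dcl_example = "GRANT SELECT ON student TO some_user"

print(rows)  # [('Asha R',)]
```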
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
they are using; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus the objectives of the three-level architecture proposal for DBMS are suitably explained
above.
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - The Data Description Language compiler processes schema definitions specified
in the DDL. This includes metadata information such as the names of the files, the data items, storage
details of each file, mapping information, constraints, etc.
2 DML Compiler and Query Optimizer - The DML commands (insert, update, delete,
retrieve) from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized by the query optimizer to find the best way to execute the query, and
then sent to the data manager.
3 Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
It converts operations in users' queries, coming from the application programs or from the combination of
DML compiler and query optimizer (known as the Query Processor), from the user's logical view
to the physical file system
It controls access to the DBMS information that is stored on disk
It also controls the handling of buffers in main memory
It also enforces constraints to maintain the consistency and integrity of the data
It also synchronizes the simultaneous operations performed by concurrent users
It also controls the backup and recovery operations
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1 Data - names of the tables, names of the attributes of each table, length of attributes, and number of rows in each table
2 Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed
3 Constraints on data, i.e., the range of values permitted
4 Detailed information on physical database design, such as storage structure,
access paths, and file and record sizes
5 Access authorization - the description of database users, their responsibilities,
and their access rights
6 Usage statistics, such as frequency of queries and transactions
The data dictionary is used to actually control data integrity, database operation,
and accuracy, and may be used as an important part of the DBMS.
Importance of Data Dictionary -
A data dictionary is necessary in databases due to the following reasons:
It improves the control of the DBA over the information system and the users'
understanding of the use of the system
It helps in documenting the database design process by storing documentation of the results of every design phase and of design decisions
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7 End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve users: Naïve users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the user of the automatic teller machine, only one or more of his or her own accounts. Other naive users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, PASCAL, or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying, and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the spatial database
becomes a common resource for an agency. It implies separation of physical storage from the use of the
data by an application program, i.e., program/data independence: the user, programmer, or
application specialist need not know the details of how the data are stored, since such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g., changing the format of data items (real to integer arithmetic operations), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one device to
another (e.g., from optical to magnetic storage, or from tape to disk).
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data
4 Sharing of data
5 Improved data integrity
6 Improved security
7 Enforcement of standards
8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to updation of the same data in different files
• Time wasted in entering data again and again
• Computer resources being needlessly used
• Difficulty in combining information
2 Elimination of Inconsistency - In the file processing system information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data, so we need to remove this duplication of
data in multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In
a conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
up-to-dateness is likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to anticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, use of a DBMS
should allow users who don't know programming to interact with the data more easily, unlike a
file processing system where the programmer may need to write new programs to meet every
new demand.
4 Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the
data in the database changes.
5 Integrity can be improved - Since data of an organization using the database approach is
centralized and used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to entry of incorrect data in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems.
7 Security can be improved - In conventional systems, applications are developed in an
ad hoc/temporary manner. Often different systems of an organization access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database. Different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8 Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work the most
important, and therefore its needs the most important. Once a database has been
set up with centralized control, it becomes necessary to identify the organization's requirements and
to balance the needs of the competing units. It may therefore become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for a similar service using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed for DBMSs than using procedural languages.
10 Data Model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems it is more likely that files will be designed as per the needs of particular
applications. The overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes such as
recovery and backups from failures, including disk crashes, power failures, and software errors,
which may help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans The E-R model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. Modelling is an
iterative, team-oriented process in which all business managers (or their designates) should be
involved, and the result should be validated with a "bottom-up" approach. The model has three
primary components: entities, relationships, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified An entity is an abstraction from the complexities of some
domain When we speak of an entity we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object, such as a house or a car; an event, such as a house sale or a car service; or a concept, such as a customer transaction or order.
An entity-type is a category. An entity, strictly speaking, is an instance of a given entity-type. There are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym.
Attributes: An attribute is a characteristic of an entity. A Student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. Types of
relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, and address (city, state, zip_code, street), where
street is itself composite (street_name, street_number, apartment_number).
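One conventional way to map the Customer entity above to relational tables can be sketched as follows (a hypothetical design, not the only valid one: composite attributes are flattened into columns, and the multivalued phone_number attribute becomes a separate table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    first_name    TEXT, middle_name TEXT, last_name TEXT,  -- composite: name
    date_of_birth TEXT,
    city TEXT, state TEXT, zip_code TEXT,                  -- composite: address
    street_name TEXT, street_number TEXT, apartment_number TEXT
);
CREATE TABLE customer_phone (                              -- multivalued attribute
    customer_id  INTEGER REFERENCES customer(customer_id),
    phone_number TEXT,
    PRIMARY KEY (customer_id, phone_number)
);
""")
```

A derived attribute such as age would not be stored at all; it would be computed from date_of_birth when needed.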
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans In sequential files, index-sequential files, and direct files we have considered the retrieval and
update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
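The stud_name example can be sketched with a secondary index (hypothetical table and data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stud_id INTEGER PRIMARY KEY, stud_name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Ravi"), (2, "Meena"), (3, "Ravi")])
# stud_name is not the primary key, so one value may match several records;
# a secondary index speeds up retrieval on it:
conn.execute("CREATE INDEX idx_name ON student(stud_name)")

# Primary-key retrieval returns at most one record ...
one = conn.execute("SELECT * FROM student WHERE stud_id = 2").fetchall()
# ... secondary-key retrieval may return a set of records:
many = conn.execute("SELECT * FROM student WHERE stud_name = 'Ravi'").fetchall()
print(len(one), len(many))  # 1 2
```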
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE3- EITHER
(A) Let R(A, B, C) and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows:
if a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF) or Projection-Join Normal Form (PJNF) if it is in 4NF and it cannot
have a lossless decomposition into any number of smaller tables
Another way of expressing this is that each join dependency is a consequence of the candidate keys.
It can also be expressed thus: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further then it is in 5NF
Pairwise cyclical dependency means that:
You always need to know two values (pairwise)
For any one you must know the other two (cyclical)
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is what do you do if Claiborne starts to sell Jeans How many records must you create to
record this fact
The problem is there are pairwise cyclical dependencies in the primary key That is in order to determine
the item you must know the buyer and vendor and to determine the vendor you must know the buyer and
the item and finally to know the buyer you must know the vendor and the item The solution is to break
this one table into three tables Buyer-Vendor Buyer-Item and Vendor-Item
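The Buyer-Vendor / Buyer-Item / Vendor-Item decomposition can be demonstrated on the sample data above (a sketch using SQLite): joining the three pairwise projections on their common columns reconstructs the original Buying table exactly, which is the join dependency at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE buying (buyer TEXT, vendor TEXT, item TEXT)")
conn.executemany("INSERT INTO buying VALUES (?, ?, ?)", [
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
])
# The three pairwise projections (the 5NF decomposition):
conn.executescript("""
CREATE TABLE bv AS SELECT DISTINCT buyer, vendor FROM buying;
CREATE TABLE bi AS SELECT DISTINCT buyer, item   FROM buying;
CREATE TABLE vi AS SELECT DISTINCT vendor, item  FROM buying;
""")
# Joining them back reproduces the original rows, and no spurious ones:
rejoined = conn.execute("""
    SELECT DISTINCT bv.buyer, bv.vendor, vi.item
    FROM bv JOIN vi ON bv.vendor = vi.vendor
            JOIN bi ON bi.buyer = bv.buyer AND bi.item = vi.item
    ORDER BY 1, 2, 3
""").fetchall()
original = conn.execute(
    "SELECT DISTINCT buyer, vendor, item FROM buying ORDER BY 1, 2, 3").fetchall()
print(rejoined == original)
```

With this design, recording that Claiborne starts to sell Jeans is a single row in the Vendor-Item table rather than one row per interested buyer.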
(B) Explain the architecture of an IMS System
Ans Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs
[Fig IMS system structure: Application A and Application B, each written in a host language + DL/I, access the IMS control program through their PSBs (PSB-A, PSB-B); each PSB is a set of PCBs, which map onto the DBDs of the physical databases]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the
associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE#,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called the program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on each segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End-users are
supported via user-written on-line application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: A possible key.
Each non-key field is functionally dependent on every candidate key.
No attribute in the key can be deleted without destroying the property of
unique identification.
Main characteristics of functional dependencies used in
normalization:
They have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of
a dependency; they hold for all time; they are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation
and has the property that every functional dependency in Y is implied by
the functional dependencies in X.
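The defining property, that the determinant fixes the dependent value, can be sketched as a small check over a relation's rows (hypothetical data): a functional dependency X → Y holds when no two rows agree on X but differ on Y.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if x in seen and seen[x] != y:
            return False  # same determinant value, different dependent value
        seen[x] = y
    return True

rows = [
    {"stud_id": 1, "name": "Ravi",  "dept": "MCA"},
    {"stud_id": 2, "name": "Meena", "dept": "MCA"},
    {"stud_id": 3, "name": "Ravi",  "dept": "MBA"},
]
print(fd_holds(rows, ["stud_id"], ["name"]))  # True: stud_id determines name
print(fd_holds(rows, ["name"], ["dept"]))     # False: Ravi maps to two depts
```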
(D) Explain 4 NF with examples
Ans Normalization: the process of decomposing unsatisfactory ("bad") relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that it meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
The database designers need not normalize to the highest possible normal form
Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes
Often executed as a series of steps Each step corresponds to a specific normal form which has
known properties
As normalization proceeds relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
BCNF: R is in BCNF iff every determinant is a candidate key
Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (or 4NF) requires that there be no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF thus
removes an unwanted kind of data structure: multivalued dependencies.
Either there is no multivalued dependency in the relation, or there are multivalued dependencies
but the attributes involved are dependent between themselves.
One of these conditions must hold for the relation to be in fourth normal form.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it considers
multivalued dependencies.
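A 4NF decomposition can be sketched on a hypothetical emp(name, skill, language) relation, where skills and languages vary independently (the multivalued dependencies name →→ skill and name →→ language), so the table splits losslessly in two:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, skill TEXT, language TEXT)")
# Every (skill, language) combination must appear, because the two
# facts are independent -- the redundancy that 4NF removes:
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", [
    ("Asha", "SQL",    "English"),
    ("Asha", "SQL",    "Hindi"),
    ("Asha", "Python", "English"),
    ("Asha", "Python", "Hindi"),
])
# The 4NF decomposition: one table per independent multivalued fact.
conn.executescript("""
CREATE TABLE emp_skill    AS SELECT DISTINCT name, skill    FROM emp;
CREATE TABLE emp_language AS SELECT DISTINCT name, language FROM emp;
""")
# Joining the two projections on name reconstructs the original losslessly:
rejoined = conn.execute("""
    SELECT s.name, s.skill, l.language
    FROM emp_skill s JOIN emp_language l ON s.name = l.name
    ORDER BY 1, 2, 3
""").fetchall()
original = conn.execute(
    "SELECT name, skill, language FROM emp ORDER BY 1, 2, 3").fetchall()
print(rejoined == original)
```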
Q5
Either
(A) What are object oriented database systems What are its features
Ans Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
(C) How is database recovery done? Discuss its different types.
Ans SQL Server database recovery models give you backup-and-restore flexibility The model used will determine how much time and space your backups will take and how great your risk of data loss will
be when a breakdown occurs
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000.
It has a built-in feature known as the database recovery model that controls the following:
Both the speed and size of your transaction log backups
The degree to which you might be at risk of losing committed transactions in the event of
media failure
Models
There are three types of database recovery model available:
Full Recovery
Bulk Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery The SQL Server fully logs all operations so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log When data files are lost because of media failure the transaction log can be backed up
Database restoration up to any specified time can be achieved after a media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
The Log Marks feature allows you to place reference points in the transaction log that let you
recover to a log mark.
This model logs CREATE INDEX operations. Recovery from a transaction log backup that includes index
creations is faster because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
It allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(D) Describe deadlocks in a distributed system.
Ans
An example of an atomic transaction is a monetary transfer from bank account A to account B It consists of two operations withdrawing the money from account A and saving it to account B
Performing these operations in an atomic transaction ensures that the database remains in a consistent
state that is money is neither lost nor created if either of those two operations fai
(B) Give the three-level architecture proposal for DBMS.
Ans: Objectives of the three-level architecture proposal for a DBMS:
All users should be able to access the same data.
A user's view is immune to changes made in other views.
Users should not need to know physical database storage details.
The DBA should be able to change the database storage structures without affecting the users' views.
The internal structure of the database should be unaffected by changes to the physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting all users.
The architecture of a database management system can be broadly divided into three levels:
a External level
b Conceptual level
c Internal level
These three levels are explained in detail below.
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database; it describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part that
is relevant to that user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
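As a small illustration of the three sublanguages, using Python's built-in sqlite3 module (the table and data are invented for the example; SQLite has no user accounts, so the DCL statement appears only as a comment):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define and declare a database object.
cur.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object.
cur.execute("INSERT INTO student (id, name) VALUES (1, 'Asha')")
cur.execute("INSERT INTO student (id, name) VALUES (2, 'Ravi')")
rows = cur.execute("SELECT name FROM student ORDER BY id").fetchall()
print(rows)  # [('Asha',), ('Ravi',)]

# DCL statements such as GRANT/REVOKE control access rights; in a
# server DBMS with user accounts this would look like:
#   GRANT SELECT ON student TO some_user;
```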
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
being used; at the conceptual level the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture The internal level describes the physical sequence of the stored records
Thus the objectives of the three-level architecture proposal for a DBMS are met, as explained
above.
(C) Describe the structure of DBMS
Ans DBMS (Database Management System) acts as an interface between the user and the
database The user requests the DBMS to perform various operations (insert delete update and
retrieval) on the database The components of DBMS perform these requested operations on the
database and provide necessary data to the users
Fig.: Structure of a Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified
in the DDL and stores them as metadata: the names of the files and data items, storage
details of each file, mapping information, constraints, etc.
2. DML Compiler and Query Optimizer - DML commands such as insert, update, delete and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized by the query optimizer to find the best
way to execute the query, and sent on to the data manager.
3. Data Manager - The data manager is the central software component of the DBMS, also known
as the database control system.
The main functions of the data manager are:
It converts operations in users' queries, coming from the application programs or from the
DML compiler and query optimizer (together known as the query processor), from the user's logical view
to the physical file system.
It controls access to the DBMS information stored on disk.
It handles buffers in main memory.
It enforces constraints to maintain the consistency and integrity of the data.
It synchronizes the simultaneous operations performed by concurrent users.
It controls backup and recovery operations.
4. Data Dictionary - A data dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data: names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
5. Access authorization: a description of database users, their responsibilities
and their access rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to control data integrity, database operation and accuracy,
and may be regarded as an important part of the DBMS.
Importance of the Data Dictionary -
A data dictionary is necessary in databases for the following reasons:
It improves the control of the DBA over the information system and the users'
understanding of the use of the system.
It helps in documenting the database design process by storing documentation of the results of every design phase and of design decisions.
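A toy sketch of what a data dictionary records (all names and figures invented) might look like this in Python:

```python
# A toy data dictionary: metadata ABOUT tables, not the data itself.
data_dictionary = {
    "student": {
        "attributes": {"id":   {"type": "INTEGER", "length": 4},
                       "name": {"type": "TEXT",    "length": 30}},
        "row_count": 2,
        "constraints": ["PRIMARY KEY (id)"],
        "access_rights": {"dba":   ["SELECT", "INSERT", "UPDATE", "DELETE"],
                          "clerk": ["SELECT"]},
    },
}

def attributes_of(table):
    """Answer a typical dictionary query: which attributes does a table have?"""
    return sorted(data_dictionary[table]["attributes"])

print(attributes_of("student"))  # ['id', 'name']
```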
5. Data Files - These contain the data portion of the database.
6. Compiled DML - The DML compiler converts high-level queries into low-level file access
commands known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1. Naïve users
2. Online users
3. Application programmers
4. Database administrator
i) Naïve users: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls into this category. The user is instructed through each step of a transaction and responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect only a precise portion of the database; in the case of the ATM user, only one or more of his or her own accounts. Other naive users are those for whom the type and range of responses is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users: These users may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. They are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naive users requiring help, such as menus.
iii) Application programmers: Professional programmers who are responsible for developing the application programs or user interfaces used by the naive and online users fall into this category. The application programs may be written in a general-purpose programming language such as assembler, C, COBOL, FORTRAN, Pascal or PL/I, and include the commands required to manipulate the database.
iv) Database administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users most familiar with the database and are responsible for creating, modifying and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view, or conceptual level, of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data that can be shared by different application
systems. This stresses the importance of sharing data across multiple applications: the database
becomes a common resource for an agency. It also implies the separation of physical storage from the
use of the data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored, since such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one
device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
duplication of the same data in different files;
wastage of storage space, since duplicated data is stored;
errors generated due to updating of the same data in different files;
time wasted in entering the same data again and again;
needless use of computer resources;
difficulty in combining information.
2. Elimination of Inconsistency - In the file processing system information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This can lead to inconsistent data, so we need to remove this duplication of
data across multiple files to eliminate inconsistency.
3. Better service to the users - A DBMS is often used to provide better services to the users. In a
conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
currency are likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in a database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, the use of a DBMS
should allow users who do not know programming to interact with the data more easily, unlike a
file processing system, where the programmer may need to write new programs to meet every
new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the
data in the database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database, and different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8. The organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work, and therefore its own
needs, as the most important. Once a database has been
set up with centralized control, it becomes necessary to identify the organization's requirements and
to balance the needs of the competing units. It may become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9. The overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for a similar service using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed for DBMSs than using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems it is more likely that files will be designed as the needs of particular
applications demand. The overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11. Backup and recovery are provided - Centralizing a database makes it possible to provide
backup and recovery schemes for failures such as disk crashes, power failures and software errors,
which help the database recover from an inconsistent state to the state that existed
prior to the failure, though the methods involved are complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The ER model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. Modelling is an
iterative, team-oriented process in which all business managers (or their designates)
should be involved, and the result should be validated with a "bottom-up" approach. The model has
three primary components: entities, relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain: when we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type. There are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student entity's attributes: student ID, student name,
address, etc.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer; a supervises relationship between an employee and a department; a performs relationship
between an artist and a song; a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One-to-many: 1 <------- M
Many-to-one: M ------- 1
Many-to-many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: entity Customer, with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
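A hedged sketch of how this Customer entity, with its composite, multivalued and derived attributes, might be modelled with Python dataclasses (all field values invented):

```python
from dataclasses import dataclass
from datetime import date

# Composite attributes become nested types, following the ER example.
@dataclass
class Name:                        # composite attribute
    first_name: str
    last_name: str
    middle_name: str

@dataclass
class Street:                      # composite attribute nested inside address
    street_name: str
    street_number: str
    apartment_number: str

@dataclass
class Address:
    city: str
    state: str
    zip_code: str
    street: Street

@dataclass
class Customer:                    # the entity; customer_id is the primary key
    customer_id: int
    name: Name
    phone_numbers: list[str]       # multivalued attribute
    date_of_birth: date
    address: Address

    @property
    def age(self) -> int:          # derived attribute: computed, not stored
        today = date.today()
        born = self.date_of_birth
        return today.year - born.year - ((today.month, today.day) < (born.month, born.day))

c = Customer(1, Name("Asha", "Rao", "K"), ["98765"], date(2000, 5, 17),
             Address("Nagpur", "MH", "440001", Street("MG Road", "12", "3B")))
```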
--------------------------------------------------------------------------------------------------------
(C) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index-sequential files and direct files we have considered the retrieval and
update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file on the attribute "stud_name", we can get the set of
records that satisfy the given value.
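A minimal Python sketch of a secondary index on stud_name (sample records invented), showing that one key value can retrieve several records:

```python
from collections import defaultdict

# Student file: records addressed by the primary key stud_id.
students = {
    101: {"stud_name": "Asha", "branch": "MCA"},
    102: {"stud_name": "Ravi", "branch": "MCA"},
    103: {"stud_name": "Asha", "branch": "MBA"},
}

# Secondary index on the non-unique attribute stud_name: each key
# value maps to the LIST of primary keys that carry that value.
name_index = defaultdict(list)
for stud_id, rec in students.items():
    name_index[rec["stud_name"]].append(stud_id)

def find_by_name(name):
    """Secondary-key retrieval: may return several records."""
    return [students[sid] for sid in name_index.get(name, [])]

print([r["branch"] for r in find_by_name("Asha")])  # ['MCA', 'MBA']
```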
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE3- EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows:
if a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot
be non-loss decomposed into any number of smaller tables.
Another way of expressing this is that every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
you always need to know two values (pairwise);
for any one value you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item, and Vendor-Item.
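The join dependency can be checked mechanically: project the sample Buying data onto the three pairwise tables and verify that their natural join reproduces the original rows exactly (a Python sketch):

```python
from itertools import product

# Sample Buying relation from the example above.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach",       "Jeans"),
    ("Mary",  "Jordach",       "Jeans"),
    ("Sally", "Jordach",       "Sneakers"),
}

# Project onto the three pairwise tables.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural join of the three projections.
def join3(bv, bi, vi):
    return {(b, v, i)
            for (b, v), (b2, i) in product(bv, bi)
            if b == b2 and (v, i) in vi}

rejoined = join3(buyer_vendor, buyer_item, vendor_item)
# The join dependency holds only if the join reproduces the original
# rows exactly, with no spurious tuples.
print(rejoined == buying)  # True
```

With this decomposition, if Claiborne starts to sell jeans only one row, (Liz Claiborne, Jeans), is added to the Vendor-Item table; the join then derives the new Buying facts for every buyer concerned.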
(B) Explain the architecture of an IMS System
Ans: Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
[Fig.: IMS system structure - application programs A and B, each written in a host language with DL/I calls, access the IMS control program through their program specification blocks (PSB-A, PSB-B), each consisting of PCBs; the control program in turn uses the DBDs that define the physical databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined, together with its mapping to storage, by a database description (DBD). The set of
all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping
definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program.
All DBD names in IMS are limited to a maximum length of eight characters.
Example
1 DBD NAME=EDUCPDBD
2 SEGM NAME=COURSE,BYTES=256
3 FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
4 FIELD NAME=TITLE,BYTES=33,START=4
5 FIELD NAME=DESCRIPN,BYTES=220,START=37
6 SEGM NAME=PREREQ,PARENT=COURSE,BYTES=36
7 FIELD NAME=(COURSE#,SEQ),BYTES=3,START=1
8 FIELD NAME=TITLE,BYTES=33,START=4
9 SEGM NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE#,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP#,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are
supported via user-written online application programs; IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional dependency: the value of one attribute (the determinant) determines the value of
another attribute.
Candidate key: a possible key. Each non-key field is functionally dependent on every candidate key,
and no attribute in the key can be deleted without destroying the property of unique identification.
The main characteristics of the functional dependencies used in normalization:
they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency;
they hold for all time;
they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is
important to find an approach that can reduce the set to a manageable size. We need to identify a
set of functional dependencies (X) for a relation that is smaller than the complete set of functional
dependencies (Y) for that relation and has the property that every functional dependency in Y is
implied by the functional dependencies in X.
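Whether a given functional dependency holds in a relation instance can be tested directly (sample data invented):

```python
def holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in `rows`.

    `rows` is a list of dicts. The FD holds iff rows that agree on all
    lhs attributes also agree on all rhs attributes.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False          # same determinant, different dependent
        seen[key] = val
    return True

staff = [
    {"staff_id": 1, "branch": "B5", "city": "London"},
    {"staff_id": 2, "branch": "B5", "city": "London"},
    {"staff_id": 3, "branch": "B7", "city": "Aberdeen"},
]

print(holds(staff, ["staff_id"], ["branch"]))  # True: staff_id determines branch
print(holds(staff, ["branch"], ["city"]))      # True
print(holds(staff, ["city"], ["staff_id"]))    # False: London maps to 1 and 2
```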
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal-form
condition that it meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties
Normalization in industry pays particular attention to normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
The database designers need not normalize to the highest possible normal form
Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes
Often executed as a series of steps Each step corresponds to a specific normal form which has
known properties
As normalization proceeds relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies
NF²: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every non-key attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every non-key attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and
only if it is in BCNF and all of its multivalued dependencies are in fact functional dependencies; 4NF
thus removes the unwanted structures caused by multivalued dependencies.
For a relation to be in fourth normal form, one of the following must hold:
there is no multivalued dependency in the relation; or
there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it also takes
multivalued dependencies into account.
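A small Python sketch of the 4NF idea (sample data invented): when two multivalued facts about an employee are independent, the relation can be split into two tables whose natural join loses nothing:

```python
from itertools import product

# employee ->-> skill and employee ->-> language hold independently:
# every (skill, language) combination appears for the employee.
emp = {
    ("Jones", "typing",    "French"),
    ("Jones", "typing",    "German"),
    ("Jones", "shorthand", "French"),
    ("Jones", "shorthand", "German"),
}

# The 4NF decomposition separates the two independent multivalued facts.
emp_skill = {(e, s) for e, s, l in emp}
emp_lang  = {(e, l) for e, s, l in emp}

# The natural join of the projections reproduces the original relation,
# so the decomposition is lossless: the multivalued dependencies held.
rejoined = {(e, s, l)
            for (e, s), (e2, l) in product(emp_skill, emp_lang)
            if e == e2}
print(rejoined == emp)  # True
```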
Q5
Either
(A) What are object oriented database systems What are its features
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s,
but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as they are in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
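A toy illustration of the direct-reference idea (using Python's pickle, which is not an OODBMS, merely a way to persist an object graph; class and data names invented):

```python
import pickle

# Objects reference each other directly; persisting the object graph
# preserves those references, so retrieval needs no join over tables.
class Account:
    def __init__(self, number, balance):
        self.number, self.balance = number, balance

class Customer:
    def __init__(self, name, accounts):
        self.name = name
        self.accounts = accounts   # direct references, not foreign keys

c = Customer("Asha", [Account("A-1", 100), Account("A-2", 250)])
blob = pickle.dumps(c)             # persist the whole object graph
restored = pickle.loads(blob)

# Navigate by following pointers; no join is needed.
print([a.number for a in restored.accounts])  # ['A-1', 'A-2']
```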
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used
determines how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to
explore the available options in order to prepare for the worst.
SQL Server database recovery is more easily achieved if you are running at least SQL Server 2000,
which has a built-in feature known as the database recovery model that controls the following:
the speed and size of your transaction log backups;
the degree to which you are at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery model available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or a BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up,
and the database can be restored up to any specified point in time after the failure. If your log file is
available after the failure, you can restore up to the last committed transaction.
The log marks feature allows you to place reference points in the transaction log so that you can
recover to a specific log mark.
This model also logs CREATE INDEX operations, so recovery from a transaction log backup that includes
index creation is faster because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance and the
least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
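The principle behind log-based recovery can be sketched in a few lines of Python (this illustrates the idea of replaying only committed transactions; it is not SQL Server's actual log format):

```python
# Minimal sketch of log-based recovery: operations are written to a log
# before being applied, so after a crash the database can be rebuilt by
# replaying the log entries of committed transactions.
log = []          # the transaction log: (txn_id, key, value) records
db = {}

def commit(txn_id, writes):
    for key, value in writes.items():
        log.append((txn_id, key, value))   # write-ahead: log first
    log.append((txn_id, "COMMIT", None))   # commit record
    for key, value in writes.items():
        db[key] = value                    # then apply to the database

def recover(log):
    """Rebuild the database by replaying only committed transactions."""
    committed = {t for t, k, v in log if k == "COMMIT"}
    restored = {}
    for txn_id, key, value in log:
        if key != "COMMIT" and txn_id in committed:
            restored[key] = value
    return restored

commit("T1", {"A": 70, "B": 80})
db.clear()                                  # simulate a media failure
print(recover(log) == {"A": 70, "B": 80})   # True: restored to last commit
```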
(d)Describe Deadlocks a Distributed System
Ans
- Components of DBMS
- DBMS Three Level Architecture Diagram
- 1 External level
- 2 Conceptual level
- 3 Internal level
- 1 Logical Data Independence
- 2 Physical Data Independence
- Sequential File Organization
- Example
- Consequences of a Lack of Referential Integrity
- One-to-one (11)
- One-to-Many (1M)
- Many-to-Many (MN)
- Lock-based Protocols
-
- Simplistic Lock Protocol
- Pre-claiming Lock Protocol
- Two-Phase Locking 2PL
- Strict Two-Phase Locking
-
- Timestamp-based Protocols
- Internet as a knowledge base[edit]
-
- Ans Join Dependencies (JD)
-
Above three points are explain in detail given bellow-
External Level
This is the highest level, the one closest to the user. It is also called the user view. The user
view is different from the way data is stored in the database; this view describes only a part of
the actual database. Because each user is not concerned with the entire database, only the part that
is relevant to the user is visible. For example, end users and application programmers get
different external views.
Each user uses a language to carry out database operations. The application programmer
uses either a conventional third-generation language, such as COBOL or C, or a fourth-generation
language specific to the DBMS, such as Visual FoxPro or MS Access.
The end user uses a query language to access data from the database. A query language is a
combination of three subordinate languages:
Data Definition Language (DDL)
Data Manipulation Language (DML)
Data Control Language (DCL)
The data definition language defines and declares database objects, while the data
manipulation language performs operations on these objects. The data control language is used to
control the user's access to database objects.
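As a hedged illustration, the three sub-languages can be seen in action with Python's built-in sqlite3 module. SQLite is only a stand-in here: it supports DDL and DML, but it has no DCL such as GRANT/REVOKE, so that part appears only as a comment.

```python
import sqlite3

# DDL and DML demonstrated with the stdlib sqlite3 module.
conn = sqlite3.connect(":memory:")

# DDL: define (declare) a database object.
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")

# DML: perform operations on that object.
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("INSERT INTO student VALUES (2, 'Ravi')")
rows = conn.execute("SELECT name FROM student ORDER BY student_id").fetchall()
print([r[0] for r in rows])  # ['Asha', 'Ravi']

# DCL (not supported by SQLite) would look like:
#   GRANT SELECT ON student TO some_user;
```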
Conceptual Level - This level comes between the external and the internal levels. The
conceptual level represents the entire database as a whole and is used by the DBA. This level is
the view of the data "as it really is". The user's view of the data is constrained by the language
being used; at the conceptual level, the data is viewed without any of these constraints.
Internal Level - This level deals with the physical storage of data and is the lowest level of
the architecture. The internal level describes the physical sequence of the stored records.
This explains the objectives of the three-level architecture proposed for a DBMS.
(C) Describe the structure of DBMS
Ans: DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations (insert, delete, update, and
retrieve) on the database. The components of the DBMS perform these requested operations on the
database and provide the necessary data to the users.
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1. DDL Compiler - The Data Description Language compiler processes schema definitions specified
in the DDL. It stores metadata such as the names of the files, data items, storage
details of each file, mapping information, and constraints.
2. DML Compiler and Query Optimizer - DML commands such as insert, update, delete, and
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access. The object code is then optimized by the query optimizer into the best
way to execute the query, and then sent to the data manager.
3. Data Manager - The Data Manager is the central software component of the DBMS, also known
as the Database Control System.
The main functions of the Data Manager are:
• It converts operations in users' queries, coming from the application programs or from the
combination of DML compiler and query optimizer (together known as the query processor), from the
user's logical view to the physical file system.
• It controls access to the DBMS information that is stored on disk.
• It handles buffers in main memory.
• It enforces constraints to maintain the consistency and integrity of the data.
• It synchronizes the simultaneous operations performed by concurrent users.
• It controls the backup and recovery operations.
4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It
contains information about:
1. Data - names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
2. Relationships between database transactions and the data items referenced by them,
which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e., the range of values permitted.
4. Detailed information on physical database design, such as storage structures,
access paths, and file and record sizes.
5. Access authorization - a description of database users, their responsibilities,
and their access rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to actually control data integrity, database operation,
and accuracy, and may be used as an important part of the DBMS.
Importance of Data Dictionary - the data dictionary is necessary in databases for the following reasons:
• It improves the DBA's control over the information system and the users'
understanding of the system.
• It helps in documenting the database design process by storing documentation of the results of every design phase and the design decisions.
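As a sketch of this idea, most DBMSs expose the data dictionary as a queryable system catalog. The catalog names below (sqlite_master, PRAGMA table_info) are SQLite-specific; SQLite is used only because Python ships with the sqlite3 module.

```python
import sqlite3

# Reading a DBMS's data dictionary (system catalog): metadata about the
# tables and attributes, not the data itself.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT)")

# Which tables exist?
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
print(tables)  # ['course']

# What are the attributes of a table, and what are their declared types?
columns = [(r[1], r[2]) for r in conn.execute("PRAGMA table_info(course)")]
print(columns)  # [('course_id', 'INTEGER'), ('title', 'TEXT')]
```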
5 Data Files - It contains the data portion of the database
6. Compiled DML - The DML compiler converts high-level queries into low-level file access
commands, known as compiled DML.
7. End Users - The users of the database system can be classified into the following groups,
depending on their degree of expertise or the mode of their interaction with the DBMS:
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve Users: Naïve users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls into this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect only a precise portion of the database: in the case of the ATM user, only one or more of his or her own accounts. Other such naïve users are those for whom the type and range of responses is always indicated. Thus, even a very competent database designer could be allowed to use a particular database system only as a naïve user.
ii) Online Users: These are users who may communicate with the database directly via an online terminal, or indirectly via a user interface and application program. They are aware of the presence of the database system and may have acquired a certain amount of expertise within the limited interaction they are permitted with the database through the intermediate application program. The more sophisticated of these users may also use a data manipulation language to manipulate the database directly. Online users can also be naïve users requiring help, such as menus.
iii) Application Programmers: Professional programmers who are responsible for developing the application programs or user interfaces utilized by the naïve and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal, or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users most familiar with the database and are responsible for creating, modifying, and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view, or conceptual level, of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and the access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data that can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of physical storage from use of the
data by an application program, i.e., program/data independence: the user, programmer, or
application specialist need not know the details of how the data are stored, because such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g., changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one
device to another (e.g., from optical to magnetic storage, or from tape to disk).
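A minimal sketch of this program/data independence, with purely illustrative class names: the application is written against an abstract insert/all interface, so the internal storage organization can change without the program changing.

```python
# Two different "physical" organizations behind one logical interface.
class ListStorage:
    def __init__(self):
        self._rows = []
    def insert(self, row):
        self._rows.append(row)
    def all(self):
        return list(self._rows)

class DictStorage:
    """A different internal organization: keyed slots instead of a list."""
    def __init__(self):
        self._rows = {}
    def insert(self, row):
        self._rows[len(self._rows)] = row
    def all(self):
        return list(self._rows.values())

def application(storage):
    """Application program: unaware of the physical organization."""
    storage.insert({"id": 1, "name": "Asha"})
    return storage.all()

# The same program runs unchanged against either storage structure.
print(application(ListStorage()) == application(DictStorage()))  # True
```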
Advantages:
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to duplication of the same data in different files
• Time wasted in entering the same data again and again
• Needless use of computer resources
• Difficulty in combining information
2. Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data. We therefore need to remove this duplication of
data across multiple files to eliminate inconsistency.
3. Better Service to the Users - A DBMS is often used to provide better services to the users. In a
conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
currency are likely to improve, since the data can now be shared and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Use of a DBMS
should also allow users who don't know programming to interact with the data more easily, unlike a
file processing system, where the programmer may need to write new programs to meet every
new demand.
4. Flexibility of the System is Improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not change when the data in the
database changes.
5. Integrity can be Improved - Since the data of an organization using the database approach is
centralized and is used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to entry of incorrect data in some of the files where it exists.
6. Standards can be Enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems.
7. Security can be Improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization access different
components of the operational data; in such an environment, enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to which parts of the database, and different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8. The Organization's Requirements can be Identified - All organizations have sections and
departments, and each of these units often considers its own work, and therefore its own
needs, the most important. Once a database has been set up with centralized control, it becomes
necessary to identify the organization's requirements and to balance the needs of the competing
units. It may become necessary to ignore some requests for information if they conflict with
higher-priority needs of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9. Overall Cost of Developing and Maintaining Systems is Lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for similar services using conventional systems,
since the productivity of programmers can be higher when using the non-procedural languages
developed with DBMSs than when using procedural languages.
10. A Data Model must be Developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, files are more likely to be designed as the needs of particular
applications demand, and the overall view is often not considered. Building an overall view of the
organization's data is usually cost-effective in the long term.
11. Provides Backup and Recovery - Centralizing a database provides schemes such as
recovery and backup from failures, including disk crashes, power failures, and software errors,
which help the database recover from an inconsistent state to the state that existed
prior to the failure, though the methods involved are very complex.
QUE2- EITHER
(A) Explain the ER model with a suitable example.
Ans: The ER model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. It is an
iterative, team-oriented process in which all business managers (or their designates)
should be involved, and it should be validated with a "bottom-up" approach. It has three primary
components: entities, relationships, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object, such as a house or a car; an event, such as a house sale or a car service; or a concept, such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types:
Simple/Single attributes
Composite attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an "owns" relationship between a company and a computer, a "supervises" relationship between an employee and a department, a "performs" relationship
between an artist and a song, a "proved" relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One-to-many: 1 ------- M
Many-to-one: M ------- 1
Many-to-many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
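The entity above can be sketched in code. The field names follow the example; the concrete types and sample values are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Name:                      # composite attribute
    first_name: str
    last_name: str
    middle_name: str = ""

@dataclass
class Address:                   # composite attribute
    city: str
    state: str
    zip_code: str
    street: str

@dataclass
class Customer:                  # entity; customer_id plays the primary-key role
    customer_id: int
    name: Name
    address: Address
    date_of_birth: str
    phone_numbers: List[str] = field(default_factory=list)  # multivalued attribute

c = Customer(1, Name("Asha", "Rao"),
             Address("Nagpur", "MH", "440001", "MG Road"),
             "1990-01-01", ["555-0100"])
print(c.name.first_name, c.customer_id)  # Asha 1
```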
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index sequential files, and direct files, we have considered the retrieval and
update of data based on a primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
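A small sketch of the idea, with hypothetical student records: a primary index maps a unique key to exactly one record, while a secondary index on stud_name maps each value to the set of matching records.

```python
from collections import defaultdict

students = [
    {"roll_no": 1, "stud_name": "Asha", "branch": "MCA"},
    {"roll_no": 2, "stud_name": "Ravi", "branch": "MCA"},
    {"roll_no": 3, "stud_name": "Asha", "branch": "MBA"},
]

# Primary index: unique key -> one record.
by_roll = {s["roll_no"]: s for s in students}

# Secondary index: non-unique key -> all matching records.
by_name = defaultdict(list)
for s in students:
    by_name[s["stud_name"]].append(s)

print(by_roll[2]["stud_name"])                   # Ravi
print([s["roll_no"] for s in by_name["Asha"]])   # [1, 3]
```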
(D) Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE3 - EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is a join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
if a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot
be further non-loss-decomposed into any number of smaller tables.
Another way of expressing this is that every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
you always need to know two values (pairwise), and
for any one you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
buyer   vendor          item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item, and Vendor-Item.
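The decomposition can be checked mechanically on the sample data above: joining the three pairwise projections reconstructs the original table (this is the join dependency), and the "Claiborne starts to sell jeans" fact then takes a single inserted row, from which the join derives the per-buyer facts.

```python
# Sample Buying relation as a set of (buyer, vendor, item) tuples.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three pairwise projections: Buyer-Vendor, Buyer-Item, Vendor-Item.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

def rejoin(bv, bi, vi):
    """Natural join of the three projections on their common columns."""
    return {(b, v, i)
            for (b, v) in bv
            for (b2, i) in bi if b2 == b
            for (v2, i2) in vi if v2 == v and i2 == i}

lossless = rejoin(buyer_vendor, buyer_item, vendor_item) == buying
print(lossless)  # True: the join dependency holds for this data

# Claiborne starts to sell jeans: ONE row in Vendor-Item records it.
vendor_item.add(("Liz Claiborne", "Jeans"))
print(len(rejoin(buyer_vendor, buyer_item, vendor_item)))  # 7
```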
(B) Explain the architecture of an IMS System
Ans: Information Management System (IMS) is an IBM program product designed to support
both batch and online application programs.
Fig.: IMS architecture - application programs A and B, each written in a host language plus DL/I,
access the physical databases (defined by DBDs) through PCBs; the PCBs of each application are
grouped into its PSB (PSB-A, PSB-B) under the IMS control program.
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined by a database description (DBD). The mapping of the physical database to storage
is also defined in the DBD. The set of all DBDs corresponds to the conceptual schema plus the
associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library, from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1 DBD NAME=EDUCPDBD
2 SEGM NAME=COURSE, BYTES=256
3 FIELD NAME=(COURSE,SEQ), BYTES=3, START=1
4 FIELD NAME=TITLE, BYTES=33, START=4
5 FIELD NAME=DESCRIPN, BYTES=220, START=37
6 SEGM NAME=PREREQ, PARENT=COURSE, BYTES=36
7 FIELD NAME=(COURSE,SEQ), BYTES=3, START=1
8 FIELD NAME=TITLE, BYTES=33, START=4
9 SEGM NAME=OFFERING, PARENT=COURSE, BYTES=20
10 FIELD NAME=(DATE,SEQ,M), BYTES=6, START=1
11 FIELD NAME=LOCATION, BYTES=12, START=7
12 FIELD NAME=FORMAT, BYTES=2, START=19
13 SEGM NAME=TEACHER, PARENT=OFFERING, BYTES=24
14 FIELD NAME=(EMP,SEQ), BYTES=6, START=1
15 FIELD NAME=NAME, BYTES=18, START=7
16 SEGM NAME=STUDENT, PARENT=OFFERING, BYTES=25
17 FIELD NAME=(EMP,SEQ), BYTES=6, START=1
18 FIELD NAME=NAME, BYTES=18, START=7
19 FIELD NAME=GRADE, BYTES=1, START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called the program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB TYPE=DB, DBNAME=EDUCPDBD, KEYLEN=15
2 SENSEG NAME=COURSE, PROCOPT=G
3 SENSEG NAME=OFFERING, PARENT=COURSE, PROCOPT=G
4 SENSEG NAME=STUDENT, PARENT=OFFERING, PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key,
and no attribute in the key can be deleted without destroying the property of
unique identification.
The main characteristics of the functional dependencies used in normalization are that they
have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of
the dependency, they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation,
and that has the property that every functional dependency in Y is implied by the
functional dependencies in X.
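The defining property, that rows agreeing on the determinant must agree on the dependent attribute, can be sketched as a small checker; the relation and attribute names here are illustrative.

```python
def fd_holds(rows, lhs, rhs):
    """True iff the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # same determinant value, different dependent value
    return True

emp = [
    {"emp_id": 1, "dept": "Sales", "dept_city": "Pune"},
    {"emp_id": 2, "dept": "Sales", "dept_city": "Pune"},
    {"emp_id": 3, "dept": "HR",    "dept_city": "Nagpur"},
]
print(fd_holds(emp, ["dept"], ["dept_city"]))    # True: dept -> dept_city
print(fd_holds(emp, ["dept_city"], ["emp_id"]))  # False: city does not determine emp_id
```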
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF, or 4NF;
we will pay particular attention up to 3NF. Database designers need not normalize to the highest
possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the
functional dependencies between its attributes. It is often executed as a series of steps, where each
step corresponds to a specific normal form with known properties. As normalization proceeds,
relations become progressively more restricted (stronger) in format and also less vulnerable to
update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
(Determinant: an attribute on which some other attribute is fully functionally dependent.)
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all its multivalued dependencies are in fact functional dependencies. 4NF
removes the unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form, either:
there is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes are dependent between themselves.
One of these conditions must hold, and the relation must also be in BCNF. Fourth normal form
differs from BCNF only in that it considers multivalued dependencies.
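A multivalued dependency X →→ Y can be checked mechanically: for each X-value, every Y-value seen with it must pair with every value of the remaining attributes. The course/book/lecturer relation below is an illustrative example, not taken from the question paper.

```python
def mvd_holds(rows, x, y, z):
    """True iff the multivalued dependency x ->> y holds in R(x, y, z)."""
    groups = {}
    for row in rows:
        key = tuple(row[a] for a in x)
        ys, zs, pairs = groups.setdefault(key, (set(), set(), set()))
        yv = tuple(row[a] for a in y)
        zv = tuple(row[a] for a in z)
        ys.add(yv); zs.add(zv); pairs.add((yv, zv))
    # MVD holds iff, per X-value, the observed (Y, Z) pairs form a full cross product.
    return all(len(pairs) == len(ys) * len(zs) for ys, zs, pairs in groups.values())

teaches = [  # books and lecturers for a course vary independently
    {"course": "DBMS", "book": "Date",    "lecturer": "Rao"},
    {"course": "DBMS", "book": "Navathe", "lecturer": "Rao"},
    {"course": "DBMS", "book": "Date",    "lecturer": "Iyer"},
    {"course": "DBMS", "book": "Navathe", "lecturer": "Iyer"},
]
print(mvd_holds(teaches, ["course"], ["book"], ["lecturer"]))      # True
print(mvd_holds(teaches[:3], ["course"], ["book"], ["lecturer"]))  # False
```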
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
Most object databases offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined. A
general characteristic, however, is that the programming language and the database schema use the same
type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the
set of all its versions, and object versions can be treated as objects in their own right. Some object
databases also provide systematic support for triggers and constraints, which are the basis of active
databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could get a user's account information and
efficiently provide extensive information such as transactions and account entries.
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will
determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to
explore the options available in order to prepare for the worst.
SQL Server database recovery is more easily achieved if you are running at least SQL Server 2000,
which has a built-in feature, known as the database recovery model, that controls the following:
• the speed and size of your transaction log backups;
• the degree to which you might be at risk of losing committed transactions in the event of
media failure.
Models
There are three types of database recovery model available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified point in time can be achieved after a media failure has
occurred for a database file. If your log file is available after the failure, you can restore up to the last
committed transaction. The Log Marks feature allows you to place reference points in the transaction
log, so that you can recover to a log mark.
This model also logs CREATE INDEX operations, so recovery from a transaction log backup that includes
index creation is faster, because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT, and UPDATETEXT.
Simple Recovery Model
It allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
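On SQL Server 2000 and later, the recovery model is switched per database with ALTER DATABASE; the database name below is illustrative, so adapt it to your environment.

```sql
-- Choose one recovery model per database (SQL Server 2000+ syntax).
ALTER DATABASE SampleDB SET RECOVERY FULL;
ALTER DATABASE SampleDB SET RECOVERY BULK_LOGGED;
ALTER DATABASE SampleDB SET RECOVERY SIMPLE;
```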
(d)Describe Deadlocks a Distributed System
Ans
- Components of DBMS
- DBMS Three Level Architecture Diagram
- 1 External level
- 2 Conceptual level
- 3 Internal level
- 1 Logical Data Independence
- 2 Physical Data Independence
- Sequential File Organization
- Example
- Consequences of a Lack of Referential Integrity
- One-to-one (11)
- One-to-Many (1M)
- Many-to-Many (MN)
- Lock-based Protocols
-
- Simplistic Lock Protocol
- Pre-claiming Lock Protocol
- Two-Phase Locking 2PL
- Strict Two-Phase Locking
-
- Timestamp-based Protocols
- Internet as a knowledge base[edit]
-
- Ans Join Dependencies (JD)
-
Fig Structure of Database Management System
Components of DBMS -
DDL Compiler
Data Manager
File Manager
Disk Manager
Query Processor
Telecommunication System
Data Files
Data Dictionary
Access Aids
1 DDL Compiler - Data Description Language compiler processes schema definitions specified
in the DDL It includes metadata information such as the name of the files data items storage
details of each file mapping information and constraints etc
2 DML Compiler and Query optimizer - The DML commands such as insert update delete
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access The object code is then optimized in the best way to execute a query by
the query optimizer and then send to the data manager
3 Data Manager - The Data Manager is the central software component of the DBMS, also known as the Database Control System.
The main functions of the Data Manager are:
• Converting operations in users' queries, coming from the application programs or from the combination of DML compiler and query optimizer (together known as the Query Processor), from the user's logical view to the physical file system.
• Controlling access to the DBMS information that is stored on disk.
• Handling buffers in main memory.
• Enforcing constraints to maintain the consistency and integrity of the data.
• Synchronizing the simultaneous operations performed by concurrent users.
• Controlling the backup and recovery operations.
4 Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It contains information about:
1. Data - the names of the tables, the names of the attributes of each table, the lengths of attributes, and the number of rows in each table.
2. Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
3. Constraints on data, i.e. the range of values permitted.
4. Detailed information on physical database design, such as storage structure, access paths, and file and record sizes.
5. Access authorization - a description of database users, their responsibilities and their access rights.
6. Usage statistics, such as the frequency of queries and transactions.
The data dictionary is used to actually control data integrity, database operation and accuracy. It may be used as an important part of the DBMS.
Importance of Data Dictionary -
The data dictionary is necessary in databases due to the following reasons:
• It improves the control of the DBA over the information system and the users' understanding of the use of the system.
• It helps in documenting the database design process by storing documentation of the result of every design phase and of the design decisions.
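The shape of such a catalog can be sketched in a few lines of Python. This is purely an illustration of the idea (table names, attribute metadata, constraints and usage counts in one repository); the table name, attribute names and row count below are made up, and real DBMSs keep this information in internal system tables, not in an application-level dictionary.

```python
# A toy data dictionary: a catalog describing tables, their attributes,
# constraints and row counts. Illustrative only -- not a real DBMS catalog.
data_dictionary = {
    "student": {
        "attributes": {
            # attribute name -> (type, length)
            "stud_id":   ("INT", 4),
            "stud_name": ("VARCHAR", 30),
            "address":   ("VARCHAR", 100),
        },
        "row_count": 1250,
        "constraints": ["PRIMARY KEY (stud_id)"],
    },
}

def describe(table):
    """Return the attribute names of a table, as a DBA tool might."""
    return list(data_dictionary[table]["attributes"])

print(describe("student"))  # ['stud_id', 'stud_name', 'address']
```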
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML compiler converts the high-level queries into low-level file access commands known as compiled DML.
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naïve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naïve Users: Naive users need not be aware of the presence of the database system or any other system. A user of an automatic teller machine falls under this category. The user is instructed through each step of a transaction; he or she responds by pressing a coded key or entering a numeric value. The operations that can be performed by this class of users are very limited and affect a precise portion of the database; in the case of the automatic teller machine user, only one or more of his or her own accounts. Other naive users are those for whom the type and range of response is always indicated. Thus even a very competent database designer could be allowed to use a particular database system only as a naive user.
ii) Online users There are users who may communicate with the database directly via an online terminal or indirectly via a user interface and application program These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program The more sophisticated of these users may also use a data manipulation language to manipulate the database directly On-line users can also be naive users requiring help such as menus
iii) Application Programmers: Professional programmers who are responsible for developing the application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, PASCAL or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator Centralized control of the database is exerted by a person or group of persons under the supervision of a high level administrator This person or group is referred to as the database administrator (DBA) They are users who are the most familiar with the database and are responsible for creating modifying and maintaining its three levels
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans A database is a collection of non-redundant data which can be shared by different application
systems. This stresses the importance of multiple applications sharing data: the database
becomes a common resource for an agency. It implies separation of physical storage from the use of the
data by an application program, i.e. program/data independence: the user, programmer or
application specialist need not know the details of how the data are stored, as such details are
transparent to the user. Changes can be made to the data without affecting other components of the
system, e.g. changing the format of data items (real to integer arithmetic), changing the file
structure (reorganizing data internally or changing the mode of access), or relocating data from one
device to another, e.g. from optical to magnetic storage, or from tape to disk.
Advantages
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system,
every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to duplication of the same data in different files
• Time wasted in entering the same data again and again
• Needless use of computer resources
• Difficulty in combining information
2 Elimination of Inconsistency - In the file processing system, information is duplicated
throughout the system, so changes made in one file may need to be carried over to
another file. This may lead to inconsistent data. So we need to remove this duplication of
data in multiple files to eliminate inconsistency.
3 Better service to the users - A DBMS is often used to provide better services to the users. In
a conventional system, availability of information is often poor, since it is normally difficult to
obtain information that the existing systems were not designed for. Once several conventional
systems are combined to form one centralized database, the availability of information and its
currency are likely to improve, since the data can now be shared, and the DBMS makes it easy to
respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined
information that would have been impossible to obtain otherwise. Also, the use of a DBMS
should allow users that don't know programming to interact with the data more easily, unlike a
file processing system, where the programmer may need to write new programs to meet every
new demand.
4 Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the
data in the database changes.
5 Integrity can be improved - Since the data of an organization using the database approach is
centralized and used by a number of users at a time, it is essential to enforce
integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or
changes may sometimes lead to the entry of incorrect data in some of the files where it exists.
6 Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems.
7 Security can be improved - In conventional systems, applications are developed in an
ad hoc, temporary manner. Often different systems of an organization would access different
components of the operational data; in such an environment enforcing security can be quite
difficult. Setting up a database makes it easier to enforce security restrictions, since the data is
now centralized. It is easier to control who has access to what parts of the database. Different
checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece
of information in the database.
8 Organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work, and therefore its own needs,
as the most important. Once a database has been
set up with centralized control, it becomes necessary to identify the organization's requirements and
to balance the needs of the competing units. So it may become necessary to ignore some
requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
one normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for a similar service using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages that
have been developed with DBMSs than using procedural languages.
10 Data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems it is more likely that files will be designed as per the needs of particular
applications. The overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11 Provides backup and recovery - Centralizing a database provides schemes for
recovery and backup from failures, including disk crashes, power failures and software errors,
which may help the database recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans The E-R model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. Modelling is an
iterative, team-oriented process in which all business managers (or their designates) should be
involved, and the result should be validated with a "bottom-up" approach. The model has three primary
components: entities, relationships and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type. There are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes An attribute is a characteristic of an entity. A Student's (entity) attributes: student ID, student name,
address, etc.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One-to-many: 1 <-------> M
Many-to-one: M <-------> 1
Many-to-many: M <-------> M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
where street is itself composite (street_name, street_number, apartment_number).
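The composite attributes in the example above can be mirrored directly as nested types. The following is a minimal Python sketch of the Customer entity (the sample values are invented for illustration); composite attributes such as name and address become nested dataclasses:

```python
from dataclasses import dataclass

# Composite attributes of the ER example become nested types.
@dataclass
class Name:
    first_name: str
    last_name: str
    middle_name: str = ""

@dataclass
class Address:
    city: str
    state: str
    zip_code: str
    street: str

@dataclass
class Customer:
    customer_id: int          # primary key
    name: Name                # composite attribute
    phone_number: str
    date_of_birth: str
    address: Address          # composite attribute

c = Customer(1, Name("Asha", "Rao"), "555-0101", "1990-01-01",
             Address("Nagpur", "MH", "440001", "12 Main St"))
print(c.name.first_name)  # Asha
```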
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example
Ans In the sequential file, index sequential file and direct file organizations we have considered retrieval and
update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
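The construction can be sketched in Python (the student data below is invented): records are stored by primary key, and a secondary index on the non-unique attribute maps each value to a list of primary keys, since several records may match one secondary-key value.

```python
# Records stored by primary key (stud_id).
records = {
    1: {"stud_name": "Asha", "city": "Nagpur"},
    2: {"stud_name": "Ravi", "city": "Pune"},
    3: {"stud_name": "Asha", "city": "Mumbai"},
}

# Secondary index on stud_name: each name maps to a LIST of primary keys,
# because a secondary key value need not be unique.
secondary_index = {}
for pk, rec in records.items():
    secondary_index.setdefault(rec["stud_name"], []).append(pk)

def lookup_by_name(name):
    """Secondary key retrieval: return all records matching the name."""
    return [records[pk] for pk in secondary_index.get(name, [])]

print([r["city"] for r in lookup_by_name("Asha")])  # ['Nagpur', 'Mumbai']
```

Note that, unlike a primary-key lookup, the result is a set of records rather than a single record.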
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3- EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows:
If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot
be non-loss decomposed any further into smaller tables.
Another way of expressing this is that each join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise)
For any one you must know the other two (cyclical)
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer | vendor        | item
Sally | Liz Claiborne | Blouses
Mary  | Liz Claiborne | Blouses
Sally | Jordach       | Jeans
Mary  | Jordach       | Jeans
Sally | Jordach       | Sneakers
The question is: what do you do if Claiborne starts to sell Jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor; to determine the vendor you must know the buyer and
the item; and to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
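The decomposition can be checked mechanically. The sketch below (in Python, using sets of tuples as relations) projects the Buying table onto its three pairwise projections and natural-joins them back; for this sample data the three-way join reproduces exactly the original rows, which is the join dependency on which 5NF rests:

```python
# The Buying relation from the example, as a set of (buyer, vendor, item) rows.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three pairwise projections (the three smaller tables).
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural join of all three projections on their shared columns.
rejoined = {
    (b, v, i)
    for (b, v) in buyer_vendor
    for (v2, i) in vendor_item if v2 == v
    if (b, i) in buyer_item
}

print(rejoined == buying)  # True: the three-way join is lossless here
```

Joining only two of the projections would produce spurious tuples (e.g. Mary-Jordach-Sneakers); it is the third projection that filters them out.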
(B) Explain the architecture of an IMS System
Ans Information Management system (IMS) is an IBM program product that is designed to support
both batch and online application programs
[Figure: IMS system structure - application programs A and B, written in a host language with DL/I calls, each operate through a PSB made up of PCBs; the IMS control program maps these external views onto the physical databases defined by DBDs.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined, together with its mapping to storage, by a database description (DBD). The set of
all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping
definition.
DBD (Database Description) Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
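The hierarchy that this DBD defines can be pictured as a tree of segments. The following Python sketch represents that tree as nested dictionaries (the field-name spellings are simplified from the listing above, and the traversal helper is our own illustration, not anything IMS provides):

```python
# The segment hierarchy defined by EDUCPDBD, as nested dicts:
# each segment lists its fields and its child segments.
educpdbd = {
    "COURSE": {
        "fields": ["COURSE", "TITLE", "DESCRIPN"],
        "children": {
            "PREREQ": {"fields": ["COURSE", "TITLE"], "children": {}},
            "OFFERING": {
                "fields": ["DATE", "LOCATION", "FORMAT"],
                "children": {
                    "TEACHER": {"fields": ["EMP", "NAME"], "children": {}},
                    "STUDENT": {"fields": ["EMP", "NAME", "GRADE"], "children": {}},
                },
            },
        },
    },
}

def segments(tree):
    """List all segment names depth-first, parents before children."""
    out = []
    for name, node in tree.items():
        out.append(name)
        out.extend(segments(node["children"]))
    return out

print(segments(educpdbd))
# ['COURSE', 'PREREQ', 'OFFERING', 'TEACHER', 'STUDENT']
```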
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block) Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block) The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are
supported via user-written on-line application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency The value of one attribute (the determinant)
determines the value of another attribute
Candidate Key A possible key
Each non-key field is functionally dependent on every candidate key
No attribute in the key can be deleted without destroying the property of
unique identification
Main characteristics of functional dependencies used in
normalization:
they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of
the dependency, they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very
large.
It is important to find an approach that can reduce that set to a manageable size.
We need to identify a set of functional dependencies (X) for a relation that is
smaller than the complete set of functional dependencies (Y) for that relation,
and that has the property that every functional dependency in Y is implied by
the functional dependencies in X.
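Whether a given functional dependency holds in a relation instance can be tested directly from the definition: equal determinant values must imply equal dependent values. A small Python sketch (with invented employee data) makes this concrete:

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the FD lhs -> rhs holds in `rows` (a list of dicts):
    rows that agree on the lhs attributes must agree on the rhs attributes."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # same determinant, different dependent value
    return True

emp = [
    {"emp_id": 1, "dept": "HR", "dept_loc": "Delhi"},
    {"emp_id": 2, "dept": "IT", "dept_loc": "Pune"},
    {"emp_id": 3, "dept": "HR", "dept_loc": "Delhi"},
]

print(fd_holds(emp, ["dept"], ["dept_loc"]))    # True: dept determines dept_loc
print(fd_holds(emp, ["dept_loc"], ["emp_id"]))  # False: Delhi maps to two ids
```

Note that a test like this can only refute an FD from sample data; that an FD holds "for all time" is a statement about the semantics of the relation, not about one instance.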
(D) Explain 4 NF with examples
Ans Normalization is the process of decomposing unsatisfactory "bad" relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal-form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties
Normalization in industry pays particular attention to normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
The database designers need not normalize to the highest possible normal form
Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes
Often executed as a series of steps Each step corresponds to a specific normal form which has
known properties
As normalization proceeds relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
BCNF: R is in BCNF iff every determinant is a candidate key
Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multi-valued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all its multi-valued dependencies are functional dependencies. 4NF
removes unwanted data structures: multi-valued dependencies.
Either of these conditions must hold in order for a relation to be in fourth normal form:
There is no multivalued dependency in the relation, or
there are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it also considers
multivalued dependencies.
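A classic illustration (the course/teacher/book data below is hypothetical) is a relation in which a course's teachers are independent of its books, so the MVD course ->> teacher holds and the relation is not in 4NF. Decomposing on the MVD is lossless, as this Python sketch checks:

```python
# Not in 4NF: teachers and books of a course are independent of each other,
# so every teacher must be paired with every book (redundancy).
course_teacher_book = {
    ("DB", "Rao",  "Date"),
    ("DB", "Rao",  "Elmasri"),
    ("DB", "Iyer", "Date"),
    ("DB", "Iyer", "Elmasri"),
}

# Decompose on the MVD course ->> teacher (and, symmetrically, course ->> book).
course_teacher = {(c, t) for c, t, b in course_teacher_book}
course_book    = {(c, b) for c, t, b in course_teacher_book}

# The decomposition is lossless: joining the two projections on `course`
# reconstructs the original relation exactly.
rejoined = {(c, t, b)
            for (c, t) in course_teacher
            for (c2, b) in course_book if c2 == c}

print(rejoined == course_teacher_book)  # True
```

After the decomposition, adding a third book means inserting one row into course_book instead of one row per teacher in the original table.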
Q5
Either
(A) What are object oriented database systems What are its features
Ans Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
(C) How is database recovery done? Discuss its different types.
Ans SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000.
It has a built-in feature known as the database recovery model that controls the following:
Both the speed and size of your transaction log backups
The degree to which you might be at risk of losing committed transactions in the event of
media failure
Models
There are three types of database recovery models available:
Full Recovery
Bulk Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
The Log Marks feature allows you to place reference points in the transaction log that allow you to
recover to a log mark.
This model logs CREATE INDEX operations, so recovery from a transaction log backup that includes index
creations is done at a faster pace because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model,
SQL Server truncates the transaction log at regular intervals, removing committed transactions.
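What "truncating the log" means can be illustrated with a toy sketch in Python (the log records and transaction numbers are invented; this is a simplified model of the idea, not of SQL Server's actual log format): records of committed transactions are discarded, so only active work remains in the log.

```python
# A toy transaction log: each record notes its transaction and commit status.
log = [
    {"txn": 1, "op": "UPDATE accounts", "committed": True},
    {"txn": 2, "op": "INSERT orders",   "committed": False},
    {"txn": 3, "op": "DELETE temp",     "committed": True},
]

def truncate(log):
    """Simple-recovery-style truncation: keep only records of transactions
    that have not yet committed; committed work no longer needs the log."""
    return [rec for rec in log if not rec["committed"]]

log = truncate(log)
print([rec["txn"] for rec in log])  # [2] -- only the active transaction remains
```

The trade-off the text describes follows directly: the log stays small, but once committed records are discarded you cannot replay them, so point-in-time restore after media failure is lost.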
(d)Describe Deadlocks a Distributed System
Ans
- Components of DBMS
- DBMS Three Level Architecture Diagram
- 1 External level
- 2 Conceptual level
- 3 Internal level
- 1 Logical Data Independence
- 2 Physical Data Independence
- Sequential File Organization
- Example
- Consequences of a Lack of Referential Integrity
- One-to-one (11)
- One-to-Many (1M)
- Many-to-Many (MN)
- Lock-based Protocols
-
- Simplistic Lock Protocol
- Pre-claiming Lock Protocol
- Two-Phase Locking 2PL
- Strict Two-Phase Locking
-
- Timestamp-based Protocols
- Internet as a knowledge base[edit]
-
- Ans Join Dependencies (JD)
-
2 DML Compiler and Query optimizer - The DML commands such as insert update delete
retrieve from the application program are sent to the DML compiler for compilation into object
code for database access The object code is then optimized in the best way to execute a query by
the query optimizer and then send to the data manager
3 Data Manager - The Data Manager is the central software component of the DBMS also knows
as Database Control System
The Main Functions Of Data Manager Are ndash
Convert operations in users Queries coming from the application programs or combination of
DML Compiler and Query optimizer which is known as Query Processor from users logical view
to physical file system
Controls DBMS information access that is stored on disk
It also controls handling buffers in main memory
It also enforces constraints to maintain consistency and integrity of the data
It also synchronizes the simultaneous operations performed by the concurrent users
It also controls the backup and recovery operations
4 Data Dictionary - Data Dictionary is a repository of description of data in the database It
contains information about
1 Data - names of the tables names of attributes of each table length of attributes and number of rows in each table
2 Relationships between database transactions and data items referenced by them
which is useful in determining which transactions are affected when certain data definitions are changed
3 Constraints on data ie range of values permitted
4 Detailed information on physical database design such as storage structure
access paths files and record sizes 5 Access Authorization - is the Description of database users their responsibilities
and their access rights
6 Usage statistics such as frequency of query and transactions 7 Data dictionary is used to actually control the data integrity database operation
and accuracy It may be used as a important part of the DBMS
8 Importance of Data Dictionary -
9 Data Dictionary is necessary in the databases due to following reasons 10 It improves the control of DBA over the information system and users
understanding of use of the system
11 bull It helps in document ting the database design process by storing documentation of the result of every design phase and design decisions
5 Data Files - It contains the data portion of the database
6 Compiled DML - The DML complier converts the high level Queries into low level file access
commands known as compiled DML
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naiumlve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naiumlve User Naive users who need not have aware of the present of the database system or any other system A user of an automatic teller falls under this category The user is instructed through each step of a transaction he or she responds by pressing a coded key or entering a numeric value The operations that can be performed by this calls of users are very limited and affect a precise portion of the database in case of the user of the automatic teller machine only one or more of her or his own accounts Other such naive users are where the type and range of response is always indicated to the user Thus a very competent database designer could be allowed to use a particular database system only as a naive user
ii) Online users There are users who may communicate with the database directly via an online terminal or indirectly via a user interface and application program These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program The more sophisticated of these users may also use a data manipulation language to manipulate the database directly On-line users can also be naive users requiring help such as menus
iii) Application Programmers: Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category. The application programs could be written in a general-purpose programming language such as Assembler, C, COBOL, FORTRAN, Pascal, or PL/I, and include the commands required to manipulate the database.
iv) Database Administrator: Centralized control of the database is exerted by a person or group of persons under the supervision of a high-level administrator. This person or group is referred to as the database administrator (DBA). They are the users who are most familiar with the database and are responsible for creating, modifying, and maintaining its three levels.
The DBA is the custodian of the data and controls the database structure. The DBA administers the three levels of the database and, in consultation with the overall user community, sets up the definition of the global view or conceptual level of the database. The DBA further specifies the external views of the various users and applications, and is responsible for the definition and implementation of the internal level, including the storage structure and access methods to be used for the optimum performance of the DBMS.
(D) What are the advantages of using a DBMS over the conventional
file processing system?
Ans: A database is a collection of non-redundant data which can be shared by different application systems. This stresses the importance of multiple applications sharing data: the database becomes a common resource for an agency. It implies separation of physical storage from the use of the data by an application program, i.e. program/data independence: the user, programmer, or application specialist need not know the details of how the data are stored; such details are transparent to the user. Changes can be made to data without affecting other components of the system, e.g. changing the format of data items (real to integer arithmetic), changing the file structure (reorganizing data internally or changing the mode of access), or relocating data from one device to another (e.g. from optical to magnetic storage, or from tape to disk).
Advantages
1. Control of data redundancy
2. Data consistency
3. More information from the same amount of data
4. Sharing of data
5. Improved data integrity
6. Improved security
7. Enforcement of standards
8. Economy of scale
1. Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files
• Wastage of storage space, since duplicated data is stored
• Errors generated due to duplication of the same data in different files
• Time wasted entering the same data again and again
• Computer resources needlessly used
• Difficulty in combining information
2. Elimination of Inconsistency - In the file processing system, information is duplicated throughout the system, so changes made in one file may need to be carried over to another file. This may lead to inconsistent data. We therefore need to remove this duplication of data in multiple files to eliminate inconsistency.
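The inconsistency risk described above can be sketched in a few lines; the file and field names here are purely illustrative, not from the source.

```python
# A small sketch of the inconsistency risk: the same address duplicated in
# two application files can drift apart after an update reaches only one copy.
payroll_file  = {"emp_id": 7, "address": "12 MG Road"}
benefits_file = {"emp_id": 7, "address": "12 MG Road"}

payroll_file["address"] = "45 FC Road"          # update reaches one file only
inconsistent = payroll_file["address"] != benefits_file["address"]

# With a centralized database, both applications read one shared record,
# so a single update is seen everywhere.
shared = {"emp_id": 7, "address": "12 MG Road"}
shared["address"] = "45 FC Road"
print(inconsistent, shared["address"])
```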
3. Better service to the users - A DBMS is often used to provide better services to the users. In a conventional system, availability of information is often poor, since it is normally difficult to obtain information that the existing systems were not designed for. Once several conventional systems are combined to form one centralized database, the availability of information and its up-to-dateness are likely to improve, since the data can now be shared and the DBMS makes it easy to respond to unanticipated information requests.
Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. Also, use of a DBMS should allow users who don't know programming to interact with the data more easily, unlike a file processing system where the programmer may need to write new programs to meet every new demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of the data stored in any system, these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.
5. Integrity can be improved - Since data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints.
In conventional systems, because the data is duplicated in multiple files, updates or changes may sometimes lead to entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purposes of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an ad hoc, temporary manner. Often different systems of an organization would access different components of the operational data; in such an environment enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database, and different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
8. Organization's requirements can be identified - All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. So it may become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9. Overall cost of developing and maintaining systems is lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for a similar service using conventional systems, since the productivity of programmers can be higher using the non-procedural languages developed with DBMSs than using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, it is more likely that files will be designed as the needs of particular applications demand, and the overall view is often not considered. Building an overall view of an organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database allows schemes for backup and recovery from failures, including disk crashes, power failures, and software errors, which may help the database recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods involved are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The E-R model is a "top-down" approach. This data model allows us to describe how data is used in a real-world enterprise. Modelling is an iterative, team-oriented process in which all business managers (or their designates) should be involved, and the result should be validated with a "bottom-up" approach. The model has three primary components: entity, relationship, and attributes.
Many notation methods exist; Chen's was the first to become established.
The building blocks of E-R model are entities relationships and attributes
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain. When we speak of an entity, we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID, student name, address, etc.
Attributes are of various types:
Simple/Single attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One-to-many: 1 ------- M
Many-to-one: M ------- 1
Many-to-many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
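As a sketch, the Customer entity above can be translated into relational DDL. The composite attributes (name, address, street) are flattened into simple columns, and the multivalued phone_number gets its own table; the table/column names follow the example, while the sample rows are invented for illustration.

```python
import sqlite3

# Translate the Customer entity into SQL DDL (in-memory SQLite).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    first_name    TEXT,
    middle_name   TEXT,
    last_name     TEXT,
    date_of_birth TEXT,
    city          TEXT,
    state         TEXT,
    zip_code      TEXT,
    street_name   TEXT,
    street_number TEXT,
    apartment_number TEXT
);
-- the multivalued attribute phone_number becomes a separate table
CREATE TABLE customer_phone (
    customer_id  INTEGER REFERENCES customer(customer_id),
    phone_number TEXT
);
""")
conn.execute("INSERT INTO customer (customer_id, first_name, last_name) "
             "VALUES (1, 'Asha', 'Rao')")
conn.execute("INSERT INTO customer_phone VALUES (1, '555-0100')")
conn.execute("INSERT INTO customer_phone VALUES (1, '555-0101')")

# One customer, several phone numbers: the multivalued attribute in action.
phones = [r[0] for r in conn.execute(
    "SELECT phone_number FROM customer_phone WHERE customer_id = 1")]
print(phones)
```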
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index-sequential files, and direct files, we have considered the retrieval and update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of records which satisfy the given value.
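The "stud_name" example above can be sketched as follows: the file is keyed on a primary key (stud_id), and a secondary index on stud_name maps each name to the keys of all matching records. The record values are invented for illustration.

```python
# Records keyed on the primary key stud_id.
records = {
    1: {"stud_id": 1, "stud_name": "Ravi",  "dept": "MCA"},
    2: {"stud_id": 2, "stud_name": "Priya", "dept": "MBA"},
    3: {"stud_id": 3, "stud_name": "Ravi",  "dept": "MSc"},
}

# Build a secondary index: names are not unique, so each name maps
# to a list of primary keys.
secondary_index = {}
for key, rec in records.items():
    secondary_index.setdefault(rec["stud_name"], []).append(key)

# A secondary-key lookup can return several records for one key value.
matches = [records[k] for k in secondary_index.get("Ravi", [])]
print(len(matches))  # 2
```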
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation

QUE 3-EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give an equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
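Although QBE itself is a graphical language, the semantics of the three queries can be sketched with Python sets, since relations are sets of tuples over R(A, B, C); the sample tuples are invented.

```python
# Two relations on schema R(A, B, C), modelled as sets of tuples.
r1 = {(1, "a", "x"), (2, "b", "y"), (3, "c", "z")}
r2 = {(2, "b", "y"), (4, "d", "w")}

union        = r1 | r2   # tuples in r1 or r2
intersection = r1 & r2   # tuples in both relations
difference   = r1 - r2   # tuples in r1 but not in r2

print(len(union), len(intersection), len(difference))  # 4 1 2
```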
QUE4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot have a non-trivial lossless decomposition into any number of smaller tables. Another way of expressing this is that each join dependency is a consequence of the candidate keys. It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
- you always need to know two values (pairwise);
- for any one value, you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer   vendor          item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and finally, to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item, and Vendor-Item.
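The decomposition into Buyer-Vendor, Buyer-Item, and Vendor-Item can be sketched directly on the sample data above; for this data, the three-way natural join recovers the original table exactly, which is what the join dependency requires.

```python
# The Buying table from the example, as a set of (buyer, vendor, item) tuples.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# Project onto the three pairwise tables of the 5NF decomposition.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural join of the three projections on their shared attributes.
rejoined = {
    (b, v, i)
    for b, v in buyer_vendor
    for v2, i in vendor_item if v == v2
    if (b, i) in buyer_item
}

print(rejoined == buying)  # the decomposition is lossless for this data
```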
(B) Explain the architecture of an IMS System
Ans: Information Management System (IMS) is an IBM program product that is designed to support both batch and online application programs.
[Diagram: IMS system architecture. Application A and Application B are each written in a host language with embedded DL/I calls. Each application runs against its own PSB (PSB-A, PSB-B), which consists of PCBs; the IMS control program uses the PCBs together with the DBDs to map requests onto the stored physical databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD), which also defines the mapping of the physical database to storage. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All names of DBDs in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called a program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End-users are supported via user-written online application programs; IMS does not provide an integrated query language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: the value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: a possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
The main characteristics of functional dependencies used in normalization are that they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency, hold for all time, and are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
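The definition above can be checked mechanically on concrete data: X → Y holds in a relation if every value of X determines exactly one value of Y. The sample relation and attribute names below are invented for illustration.

```python
# Check whether the functional dependency X -> Y holds in a relation
# (a list of row dictionaries): each X-value must map to one Y-value.
def fd_holds(rows, x, y):
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in x)
        val = tuple(row[a] for a in y)
        # setdefault records the first Y-value seen for this X-value;
        # any later, different Y-value violates the dependency.
        if seen.setdefault(key, val) != val:
            return False
    return True

students = [
    {"stud_id": 1, "stud_name": "Ravi",  "dept": "MCA"},
    {"stud_id": 2, "stud_name": "Priya", "dept": "MBA"},
    {"stud_id": 3, "stud_name": "Ravi",  "dept": "MSc"},
]

print(fd_holds(students, ["stud_id"], ["stud_name"]))  # True
print(fd_holds(students, ["stud_name"], ["dept"]))     # False: Ravi -> MCA, MSc
```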
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory, "bad" relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal form condition that it meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties. Normalization in industry pays particular attention to normalization up to 3NF, BCNF, or 4NF; here we pay particular attention up to 3NF. The database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form which has known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF²: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
BCNF: R is in BCNF iff every determinant is a candidate key
Determinant: an attribute on which some other attribute is fully functionally dependent

Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies of attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and only if it is in BCNF and its multi-valued dependencies are functional dependencies. 4NF removes unwanted data structures: multi-valued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
- there is no multivalued dependency in the relation; or
- there are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it uses multivalued dependencies.
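A small sketch of a 4NF decomposition: suppose a course's teachers and its textbooks are independent multivalued facts (course ↠ teacher and course ↠ book), so the single table must hold every teacher-book combination. Splitting it removes the redundancy, and rejoining on course is lossless. The course, teacher, and book names are invented.

```python
# One row per (course, teacher, book) combination: the MVDs force a
# cross product of teachers and books for each course.
course_info = {
    ("DBMS", "Dr. Rao",   "Korth"),
    ("DBMS", "Dr. Rao",   "Navathe"),
    ("DBMS", "Dr. Mehta", "Korth"),
    ("DBMS", "Dr. Mehta", "Navathe"),
}

# 4NF decomposition: one table per independent multivalued fact.
course_teacher = {(c, t) for c, t, b in course_info}
course_book    = {(c, b) for c, t, b in course_info}

# Rejoining on course recovers the original rows, so the split is lossless.
rejoined = {(c, t, b)
            for c, t in course_teacher
            for c2, b in course_book if c == c2}
print(rejoined == course_info)  # True
```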
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could retrieve a user's account information and efficiently provide extensive information such as transactions, account entries, etc.
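The "no joins needed" point above can be sketched as pointer following: the account objects are reached directly through references held by the customer object, with no join over a separate table. The class and field names are illustrative, not from any particular object database product.

```python
# Navigational access in an object database, sketched with plain objects:
# a Customer holds direct references to its Account objects.
class Account:
    def __init__(self, number, balance):
        self.number, self.balance = number, balance

class Customer:
    def __init__(self, name):
        self.name, self.accounts = name, []

cust = Customer("Asha")
cust.accounts.append(Account("A-100", 2500))
cust.accounts.append(Account("A-101", 400))

# Follow the references directly: no key lookup or join is needed.
total = sum(a.balance for a in cust.accounts)
print(total)  # 2900
```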
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will determine how much time and space your backups will take, and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL Server database recovery can be more easily achieved if you are running at least SQL Server 2000. It has a built-in feature known as the database recovery model that controls the following:
- the speed and size of your transaction log backups;
- the degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery models available:
- Full Recovery
- Bulk-Logged Recovery
- Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. The SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log. When data files are lost because of media failure, the transaction log can be backed up.
- Database restoration up to any specified time can be achieved after media failure for a database file has occurred. If your log file is available after the failure, you can restore up to the last committed transaction.
- The Log Marks feature allows you to place reference points in the transaction log that let you recover to a log mark.
- CREATE INDEX operations are logged. Recovery from a transaction log backup that includes index creations is faster because the index does not have to be rebuilt.
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
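The key difference between the models can be illustrated with a toy sketch (this is an analogy, not SQL Server's actual log format): under the full model every record is retained for point-in-time restore, while under the simple model committed work is truncated at each checkpoint.

```python
# A toy transaction log: (transaction, state) pairs.
log = [("T1", "committed"), ("T2", "committed"), ("T3", "active")]

def checkpoint(log, model):
    if model == "simple":
        # Committed work is truncated; only active transactions remain,
        # so point-in-time restore of committed work is no longer possible.
        return [rec for rec in log if rec[1] == "active"]
    # Full model: everything is retained until the log is backed up.
    return list(log)

print(len(checkpoint(log, "full")), len(checkpoint(log, "simple")))  # 3 1
```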
(d) Describe deadlocks in a distributed system.
Ans
7 End Users The users of the database system can be classified in the following groups
depending on their degree of expertise or the mode of their interactions with the DBMS
1 Naiumlve users
2 Online Users
3 Application Programmers
4 Database administrator
i) Naiumlve User Naive users who need not have aware of the present of the database system or any other system A user of an automatic teller falls under this category The user is instructed through each step of a transaction he or she responds by pressing a coded key or entering a numeric value The operations that can be performed by this calls of users are very limited and affect a precise portion of the database in case of the user of the automatic teller machine only one or more of her or his own accounts Other such naive users are where the type and range of response is always indicated to the user Thus a very competent database designer could be allowed to use a particular database system only as a naive user
ii) Online users There are users who may communicate with the database directly via an online terminal or indirectly via a user interface and application program These users are aware of the presence of the database system and may have acquired a certain amount of expertise in the limited interaction they are permitted with the database through the intermediate application program The more sophisticated of these users may also use a data manipulation language to manipulate the database directly On-line users can also be naive users requiring help such as menus
iii) Application Users Professional programmers who are responsible for developing application programs or user interfaces utilized by the naive and online users fall into this category The application programs could be written in a general purpose programming language such as Assembler C COBOL FORTRAN PASCAL or PLI and include the commands required to manipulate the database
iv) Database Administrator Centralized control of the database is exerted by a person or group of persons under the supervision of a high level administrator This person or group is referred to as the database administrator (DBA) They are users who are the most familiar with the database and are responsible for creating modifying and maintaining its three levels
The DBA us the custodian of the data and controls the database structure The DBA administers the three levels of the database and in consultation with the overall user community sets up the definition of the global view or conceptual level of the database The DBA further specifies the external view of the various users and applications and is responsible for definition and implementation of the internal level including the storage structure and access methods to be used for the optimum performance of the DBMS
(D) What are the advantage o f using a DBMS over the conventional
fole processing system
Ans A database is a collection of non-redundant data which can be shared by different application
systems stresses the importance of multiple applications data sharing the spatial database
becomes a common resource for an agency implies separation of physical storage from use of the
data by an application program ie programdata independence the user or programmer or
application specialist need not know the details of how the data are stored such details are
transparent to the user changes can be made to data without affecting other components of the
system eg change format of data items (real to integer arithmetic operations) change file
structure (reorganize data internally or change mode of access) relocate from one device to
another eg from optical to magnetic storage from tape to disk
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data 4 Sharing of data
5 Improved data integrity
6 Improved security 7 Enforcement of standards 8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system
Every user group maintains its own files for handling its data files This may lead to
bull Duplication of same data in different files
bull Wastage of storage space since duplicated data is stored
bull Errors may be generated due to pupation of the same data in different files
bull Time in entering data again and again is wasted
bull Computer Resources are needlessly used
bull It is very difficult to combine information
2 Elimination of Inconsistency - In the file processing system information is duplicated
throughout the system So changes made in one file may be necessary be carried over to
another file This may lead to inconsistent data So we need to remove this duplication of
data in multiple file to eliminate inconsistency
3 Better service to the users - A DBMS is often used to provide better services to the users In
conventional system availability of information is often poor since it normally difficult to
obtain information that the existing systems were not designed for Once several conventional
systems are combined to form one centralized database the availability of information and its
update ness is likely to improve since the data can now be shared and DBMS makes it easy to
respond to anticipated information requests
Centralizing the data in the database also means that user can obtain new and combined
information easily that would have been impossible to obtain otherwise Also use of DBMS
should allow users that dont know programming to interact with the data more easily unlike
file processing system where the programmer may need to write new programs to meet every
new demand
4 Flexibility of the System is improved - Since changes are often necessary to the contents of
the data stored in any system these changes are made more easily in a centralized database
than in a conventional system Applications programs need not to be changed on changing the
data in the database
5 Integrity can be improved - Since data of the organization using database approach is
centralized and would be used by a number of users at a time It is essential to enforce
integrity-constraints
In the conventional systems because the data is duplicated in multiple files so updating or
changes may sometimes lead to entry of incorrect data in some files where it exists
6 Standards can be enforced - Since all access to the database must be through DBMS so
standards are easier to enforce Standards may relate to the naming of data format of data
structure of the data etc Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems
7 Security can be improved - In conventional systems applications are developed in an
adhoctemporary manner Often different system of an organization would access different
components of the operational data in such an environment enforcing security can be quiet
difficult Setting up of a database makes it easier to enforce security restrictions since data is
now centralized It is easier to control who has access to what parts of the database Different
checks can be established for each type of access (retrieve modify delete etc) to each piece
of information in the database
8. The organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work, and therefore its own
needs, as the most important. Once a database has been set up with centralized control, it
becomes necessary to identify the organization's requirements and to balance the needs of the
competing units. So it may become necessary to ignore some requests for information if they
conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9. The overall cost of developing and maintaining systems is lower - It is much easier to respond
to unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large, one
normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for a similar service using conventional systems,
since the productivity of programmers can be higher when using the non-procedural languages
that have been developed with DBMSs than when using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand. The overall view is often not considered. Building an overall view of the
organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes such as
recovery and backup from failures, including disk crashes, power failures, and software errors,
which may help the database to recover from an inconsistent state to the state that existed
prior to the occurrence of the failure, though the methods are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The E-R model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. Modeling
with it is an iterative, team-oriented process in which all business managers (or their
designates) are involved, and the result should be validated with a "bottom-up" approach. The
model has three primary components: entities, relationships, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships, and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of some
domain. When we speak of an entity, we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world. An entity may be a physical object such as a house or a
car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are
usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most
people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student (entity) has attributes such as student ID,
student name, address, etc.
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can
be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and
a computer, a supervises relationship between an employee and a department, a performs relationship
between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships
are represented as diamonds connected by lines to each of the entities in the relationship. The types of
relationships are as follows:
One to many: 1 <------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example:
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, address (city, state, zip_code, street),
street (street_name, street_number, apartment_number).
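As a worked sketch, the Customer entity above can be mapped onto a single relation by flattening its composite attributes (name, address) into their simple components. The table and column names below just follow the example; inlining all parts into one table is one illustrative choice, not the only correct mapping.

```python
import sqlite3

# In-memory database; the Customer entity flattened into one relation.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,
        first_name       TEXT,
        middle_name      TEXT,
        last_name        TEXT,
        phone_number     TEXT,
        date_of_birth    TEXT,
        city             TEXT,
        state            TEXT,
        zip_code         TEXT,
        street_name      TEXT,
        street_number    TEXT,
        apartment_number TEXT
    )
""")
conn.execute(
    "INSERT INTO customer (customer_id, first_name, last_name, city) "
    "VALUES (1, 'Sally', 'Jones', 'Nagpur')"
)
row = conn.execute("SELECT first_name, city FROM customer").fetchone()
print(row)  # ('Sally', 'Nagpur')
```

In a fuller design, a multivalued attribute such as phone_number would normally be moved to its own table keyed by customer_id.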
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: For sequential files, index-sequential files, and direct files, we have considered the retrieval and
update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file based on the attribute "stud_name", we can get the set of
records which satisfy the given value.
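A minimal sketch of the idea, with an in-memory student file and invented records: a secondary index on stud_name maps each name to all matching records, so one key value can return several records (unlike a primary key, which identifies exactly one).

```python
# Invented student records; roll_no is the primary key, stud_name is not unique.
students = [
    {"roll_no": 101, "stud_name": "Asha"},
    {"roll_no": 102, "stud_name": "Ravi"},
    {"roll_no": 103, "stud_name": "Asha"},
]

# Secondary index: stud_name -> list of matching records.
index = {}
for rec in students:
    index.setdefault(rec["stud_name"], []).append(rec)

# One secondary-key value retrieves a set of records.
matches = [r["roll_no"] for r in index["Asha"]]
print(matches)  # [101, 103]
```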
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3- EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give an equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is a join dependency? Discuss 5NF.
Ans Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined
again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot
have a further lossless decomposition into any number of smaller tables.
Another way of expressing this is: each join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key
comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one value, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:
Buyer Vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to
record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine
the item you must know the buyer and vendor, to determine the vendor you must know the buyer and
the item, and finally, to know the buyer you must know the vendor and the item. The solution is to break
this one table into three tables: Buyer-Vendor, Buyer-Item, and Vendor-Item.
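The decomposition can be sketched in a few lines. The sets below hold the sample rows from the example, and the natural join of the three two-column projections recovers exactly the original table, which is the join dependency that 5NF relies on:

```python
# The Buying table from the example, as a set of (buyer, vendor, item) rows.
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary",  "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

# The three two-column projections: Buyer-Vendor, Buyer-Item, Vendor-Item.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural join of the three projections: keep a (buyer, vendor, item) triple
# only if all three of its pairs appear in the projections.
rejoined = {
    (b, v, i)
    for b, v in buyer_vendor
    for b2, i in buyer_item if b2 == b
    if (v, i) in vendor_item
}
print(rejoined == buying)  # True: the three-way decomposition is lossless
```

Note that a spurious row such as (Mary, Jordach, Sneakers) is filtered out, because (Mary, Sneakers) never appears in the Buyer-Item projection.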
(B) Explain the architecture of an IMS System
Ans: Information Management System (IMS) is an IBM program product that is designed to support
both batch and online application programs.
[Architecture diagram: Application A and Application B, each written in a host language + DL/I, access the
IMS control program through their own PSBs (PSB-A, PSB-B), each PSB being a set of PCBs; the IMS
control program in turn uses the DBDs that define the physical databases.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat
misleading in this context, since the user does not see such a database exactly as it is stored; indeed,
IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical
database is defined, together with its mapping to storage, by a database description (DBD). The set of
all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping
definition.
DBD (Database Description): Each physical database is defined, together with its mapping to
storage, by a database description (DBD). The source form of the DBD is written using special
System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program.
All DBD names in IMS are limited to a maximum length of eight characters.
Example
DBD   NAME=EDUCPDBD
SEGM  NAME=COURSE,BYTES=256
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
FIELD NAME=DESCRIPN,BYTES=220,START=37
SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
FIELD NAME=LOCATION,BYTES=12,START=7
FIELD NAME=FORMAT,BYTES=2,START=19
SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called a program specification block
(PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the LDB and
the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
SENSEG NAME=COURSE,PROCOPT=G
SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on the segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are
supported via user-written online application programs. IMS does not provide an integrated query
language.
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of
another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate
key, and no attribute in the key can be deleted without destroying the property of unique
identification.
Main characteristics of functional dependencies used in normalization:
There is a 1:1 relationship between the attribute(s) on the left- and right-hand sides of a
dependency; they hold for all time; and they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is
important to find an approach that can reduce the set to a manageable size. We need to identify a
set of functional dependencies (X) for a relation that is smaller than the complete set of functional
dependencies (Y) for that relation, and that has the property that every functional dependency in Y
is implied by the functional dependencies in X.
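As an illustration, a functional dependency X -> Y can be checked mechanically over a small relation (the student rows below are invented for the example): the dependency holds iff no two rows agree on X but disagree on Y.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the FD lhs -> rhs holds in rows (a list of dict rows)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if x in seen and seen[x] != y:
            return False  # two rows agree on lhs but disagree on rhs
        seen[x] = y
    return True

# Invented sample relation.
students = [
    {"student_id": 1, "name": "Asha", "city": "Nagpur"},
    {"student_id": 2, "name": "Ravi", "city": "Pune"},
    {"student_id": 3, "name": "Asha", "city": "Mumbai"},
]

print(fd_holds(students, ["student_id"], ["name"]))  # True: the key determines name
print(fd_holds(students, ["name"], ["city"]))        # False: two Ashas, different cities
```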
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties. Normalization in industry pays particular attention to normalization up to 3NF,
BCNF, or 4NF; we will pay particular attention up to 3NF. Database designers need not normalize to
the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the
functional dependencies between its attributes. It is often executed as a series of steps, where each
step corresponds to a specific normal form which has known properties. As normalization proceeds,
relations become progressively more restricted (stronger) in format and also less vulnerable to
update anomalies.
NF2: non-first normal form
1NF: R is in 1NF iff all domain values are atomic
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key
BCNF: R is in BCNF iff every determinant is a candidate key
Determinant: an attribute on which some other attribute is fully functionally dependent
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is said to be in 4NF if and
only if it is in BCNF and all its multivalued dependencies are functional dependencies. 4NF removes
the unwanted data structures: multivalued dependencies.
For a relation to be in fourth normal form, either of these conditions must hold:
There is no multivalued dependency in the relation, or
There are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it uses
multivalued dependencies.
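A multivalued dependency X ->> Y can likewise be tested mechanically. In the sketch below (the course/teacher/book rows are invented), X ->> Y holds iff, within each group of rows sharing an X value, every combination of a Y value and a rest-of-row value also appears as a row:

```python
from itertools import product

def mvd_holds(rows, x_attrs, y_attrs):
    """Check the MVD x_attrs ->> y_attrs; z = all remaining attributes."""
    z_attrs = [a for a in rows[0] if a not in x_attrs + y_attrs]
    groups = {}
    for row in rows:
        key = tuple(row[a] for a in x_attrs)
        ys, zs, yz = groups.setdefault(key, (set(), set(), set()))
        y = tuple(row[a] for a in y_attrs)
        z = tuple(row[a] for a in z_attrs)
        ys.add(y)
        zs.add(z)
        yz.add((y, z))
    # MVD holds iff, per group, observed (y, z) pairs form the full cross product.
    return all(set(product(ys, zs)) == yz for ys, zs, yz in groups.values())

# Invented relation: teachers and books for a course vary independently.
course = [
    {"course": "DBMS", "teacher": "Rao",  "book": "Date"},
    {"course": "DBMS", "teacher": "Rao",  "book": "Navathe"},
    {"course": "DBMS", "teacher": "Iyer", "book": "Date"},
    {"course": "DBMS", "teacher": "Iyer", "book": "Navathe"},
]
print(mvd_holds(course, ["course"], ["teacher"]))  # True: course ->> teacher
```

Dropping the last row breaks the cross product, and the same call then returns False.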
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been considered since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found through a more
declarative programming approach. It is in the area of object query languages, and the integration of the
query and navigational interfaces, that the biggest differences between products are found. An attempt at
standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a
relational database). This is because an object can be retrieved directly, without a search, by following
pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is the way that the schema of a database is defined. A
general characteristic, however, is that the programming language and the database schema use the same
type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the
set of all its versions, and object versions can be treated as objects in their own right. Some object
databases also provide systematic support for triggers and constraints, which are the basis of active
databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item. For example, a banking institution could retrieve a user's account information and
efficiently provide extensive information such as transactions and account entries.
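The pointer-following idea above can be sketched with plain objects; the classes below are illustrative and not a real OODBMS API. An account holds direct references to its transaction objects, so fetching them is simple navigation rather than a relational join on an account_id column:

```python
from dataclasses import dataclass, field

@dataclass
class Transaction:
    amount: float
    kind: str

@dataclass
class Account:
    owner: str
    # Direct references to the account's Transaction objects.
    transactions: list = field(default_factory=list)

acct = Account("Sally")
acct.transactions.append(Transaction(-40.0, "withdrawal"))
acct.transactions.append(Transaction(250.0, "deposit"))

# Navigate from the account object straight to its transactions.
kinds = [t.kind for t in acct.transactions]
print(kinds)  # ['withdrawal', 'deposit']
```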
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used will
determine how much time and space your backups will take and how great your risk of data loss will
be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have to
explore the options available in order to prepare for the worst.
SQL Server database recovery can be achieved more easily if you are running at least SQL Server 2000.
It has a built-in feature known as the database recovery model that controls the following:
Both the speed and size of your transaction log backups
The degree to which you might be at risk of losing committed transactions in the event of
media failure
Models
There are three types of database recovery models available:
Full Recovery
Bulk-Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery. The SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
Database restoration up to any specified time can be achieved after media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction. The Log Marks feature allows you to place reference points in the transaction
log that allow you to recover to a log mark.
This model also logs CREATE INDEX operations. Recovery from a transaction log backup that includes
index creations is done at a faster pace because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the
least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this
model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans: A deadlock occurs when a set of transactions wait for one another in a cycle, so that none of them
can proceed: each transaction holds a lock that another transaction in the cycle is waiting for. In a
distributed system, the transactions and the data items they lock may reside at different sites, so no
single site can see the complete wait-for graph, which makes deadlocks harder to detect. The common
approaches are:
Detection - Maintain a wait-for graph. This may be done at a central coordinator site that collects local
wait-for information and checks the combined graph for cycles (centralized detection), or the sites may
exchange wait-for information among themselves (distributed detection). A cycle in the global graph
means a deadlock, which is broken by rolling back a victim transaction.
Prevention - Use timestamp-based schemes such as wait-die and wound-wait, which order transactions
by their timestamps and allow waiting in only one direction, so that a cycle can never form.
Timeouts - A transaction that has waited longer than a threshold is simply rolled back, on the
assumption that it may be deadlocked. This is simple, but may roll back transactions that are not
actually deadlocked.
Ans A database is a collection of non-redundant data which can be shared by different application
systems stresses the importance of multiple applications data sharing the spatial database
becomes a common resource for an agency implies separation of physical storage from use of the
data by an application program ie programdata independence the user or programmer or
application specialist need not know the details of how the data are stored such details are
transparent to the user changes can be made to data without affecting other components of the
system eg change format of data items (real to integer arithmetic operations) change file
structure (reorganize data internally or change mode of access) relocate from one device to
another eg from optical to magnetic storage from tape to disk
Advantages
1 Control of data redundancy
2 Data consistency
3 More information from the same amount of data 4 Sharing of data
5 Improved data integrity
6 Improved security 7 Enforcement of standards 8 Economy of scale
1 Controlling Data Redundancy - In the conventional file processing system
Every user group maintains its own files for handling its data files This may lead to
bull Duplication of same data in different files
bull Wastage of storage space since duplicated data is stored
bull Errors may be generated due to pupation of the same data in different files
bull Time in entering data again and again is wasted
bull Computer Resources are needlessly used
bull It is very difficult to combine information
2 Elimination of Inconsistency - In the file processing system information is duplicated
throughout the system So changes made in one file may be necessary be carried over to
another file This may lead to inconsistent data So we need to remove this duplication of
data in multiple file to eliminate inconsistency
3 Better service to the users - A DBMS is often used to provide better services to the users In
conventional system availability of information is often poor since it normally difficult to
obtain information that the existing systems were not designed for Once several conventional
systems are combined to form one centralized database the availability of information and its
update ness is likely to improve since the data can now be shared and DBMS makes it easy to
respond to anticipated information requests
Centralizing the data in the database also means that user can obtain new and combined
information easily that would have been impossible to obtain otherwise Also use of DBMS
should allow users that dont know programming to interact with the data more easily unlike
file processing system where the programmer may need to write new programs to meet every
new demand
4 Flexibility of the System is improved - Since changes are often necessary to the contents of
the data stored in any system these changes are made more easily in a centralized database
than in a conventional system Applications programs need not to be changed on changing the
data in the database
5 Integrity can be improved - Since data of the organization using database approach is
centralized and would be used by a number of users at a time It is essential to enforce
integrity-constraints
In the conventional systems because the data is duplicated in multiple files so updating or
changes may sometimes lead to entry of incorrect data in some files where it exists
6 Standards can be enforced - Since all access to the database must be through DBMS so
standards are easier to enforce Standards may relate to the naming of data format of data
structure of the data etc Standardizing stored data formats is usually desirable for the purpose
of data interchange or migration between systems
7 Security can be improved - In conventional systems applications are developed in an
adhoctemporary manner Often different system of an organization would access different
components of the operational data in such an environment enforcing security can be quiet
difficult Setting up of a database makes it easier to enforce security restrictions since data is
now centralized It is easier to control who has access to what parts of the database Different
checks can be established for each type of access (retrieve modify delete etc) to each piece
of information in the database
8 Organizations requirement can be identified - All organizations have sections and
departments and each of these units often consider the work of their unit as the most
important and therefore consider their need as the most important Once a database has been
setup with centralized control it will be necessary to identify organizations requirement and
to balance the needs of the competating units So it may become necessary to ignore some
requests for information if they conflict with higher priority need of the organization
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for an organization
9 Overall cost of developing and maintaining systems is lower - It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system Although the initial cost of setting up of a database can be large
one normal expects the overall cost of setting up of a database developing and maintaining
application programs to be far lower than for similar service using conventional systems
Since the productivity of programmers can be higher in using non-procedural languages that
have been developed with DBMS than using procedural languages
10 Data Model must be developed - Perhaps the most important advantage of setting up of
database system is the requirement that an overall data model for an organization be build In
conventional systems it is more likely that files will be designed as per need of particular
applications demand The overall view is often not considered Building an overall view of an
organizations data is usual cost effective in the long terms
11 Provides backup and Recovery - Centralizing a database provides the schemes such as
recovery and backups from the failures including disk crash power failures software errors
which may help the database to recover from the inconsistent state to the state that existed
prior to the occurrence of the failure though methods are very complex
QUE2- EITHER
(A) Explain ER model with suitable example
Ans It is a ldquotop-downrdquo approach
This data model allows us to describe how data is used in a real-world enterprise an
iterative process A team-oriented process with all business managers (or designates)
involved should validate with a ldquobottom-uprdquo approach Has three primary components entity
relationship attributes
Many notation methods Chen was the first to become established
The building blocks of E-R model are entities relationships and attributes
Entity An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified An entity is an abstraction from the complexities of some
domain When we speak of an entity we normally speak of some aspect of the real world which can be
distinguished from other aspects of the real world An entity may be a physical object such as a house or a car an event such as a house sale or a car service or a concept such as a customer transaction or order
An entity-type is a category An entity strictly speaking is an instance of a given entity-type There are
usually many instances of an entity-type Because the term entity-type is somewhat cumbersome most
people tend to use the term entity as a synonym for this term
Attributes It is a Characteristic of an entity Studentrsquos (entity) attributes student ID student name
address etc
Attributes are of various types
SimpleSingle Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship Relationship captures how two or more entities are related to one another Relationships can
be thought of as verbs linking two or more nouns Examples an owns relationship between a company and a computer a supervises relationship between an employee and a department a performs relationship
between an artist and a song a proved relationship between a mathematician and a theorem Relationships
are represented as diamonds connected by lines to each of the entities in the relationship Types of
relationships are as follows
One to many 1lt------- M Many to one M------1
Many to many M------M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given Entity Customer with attributes customer_id(primary key) name( first_name last_name
middle_name) phone_number date_of_birth address(citystatezip_codestreet)
Street(Street_namestreet_numberapartment_number)
--------------------------------------------------------------------------------------------------------
(c)Illustrate the construction of secondrery key retrieval with a suitable example
Ans In sequential File Index Sequential file and Direct File we have considered the retrieval and
update of data based on primary key
(i)We can retrieve and update data based on secondary key called as secondary key retrieval
(ii)In secondary key retrieval there are multiple records satisfying a given key value
(iii)For eg if we search a student file based on the attribute ldquostud_namerdquo we can get the set of
records which satisfy the given value
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation QUE 3-EITHER
(A) Let R(ABC) and Let r1 and r2 both be relations on schema R given the equivalent QBE
expression for each of the following queries -
(i) Y1 u y2
(ii) Y1 u y2
(iii) R1-r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows
1 If a table can be decomposed into three or more smaller tables it must be capable of being joined
again on common keys to form the original table
A table is in fifth normal form (5NF) or Projection-Join Normal Form (PJNF) if it is in 4NF and it cannot
have a lossless decomposition into any number of smaller tables
Another way of expressing this is and each join dependency is a consequence of the candidate keys
It can also be expressed as there are no pair wise cyclical dependencies in the primary key
comprised of three or more attributes
Anomalies can occur in relations in 4NF if the primary key has three or more fields
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further then it is in 5NF
Pair wise cyclical dependency means that
You always need to know two values (pair wise)
For any one you must know the other two (cyclical)
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is what do you do if Claiborne starts to sell Jeans How many records must you create to
record this fact
The problem is there are pairwise cyclical dependencies in the primary key That is in order to determine
the item you must know the buyer and vendor and to determine the vendor you must know the buyer and
the item and finally to know the buyer you must know the vendor and the item The solution is to break
this one table into three tables Buyer-Vendor Buyer-Item and Vendor-Item
(B) Explain the architecture of an IMS System
Ans Information Management system (IMS) is an IBM program product that is designed to support
both batch and online application programs
Host Language
+
DLI
Host Language
+
DLI
PCB PCB
DBD DBD DBD DBD DBD DBD hellip
IMS
Control
program
PCB PCB
PSB-B PSB-A
Application A Application B
Conceptual View
The conceptual view consists of collection of physical database The ldquophysicalrdquo is somewhat
misleading in this context since the user does not see such a database exactly as it is stored indeed
IMS provides a fairely high degree of insulation of the user from the storage structure Each physical
database is defined by a database description (DBD) The mapping of the physical database to storage
is also DBDrsquos corresponds to the conceptual schema plus the associated conceptualinternal mapping
definition
DBD (Database Description) Each physical databse is defined together with its mapping to
storage by a databse description (DBD) The source form of the DBD is written using a special
System370 Assembler Language macro statements Once written the DBD is assembled and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program
All names of DBDrsquos in IMS are limited to a maximum length of eight characters
Example
DBD   NAME=EDUCPDBD
SEGM  NAME=COURSE,BYTES=256
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
FIELD NAME=DESCRIPN,BYTES=220,START=37
SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
FIELD NAME=TITLE,BYTES=33,START=4
SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
FIELD NAME=LOCATION,BYTES=12,START=7
FIELD NAME=FORMAT,BYTES=2,START=19
SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
FIELD NAME=(EMP,SEQ),BYTES=6,START=1
FIELD NAME=NAME,BYTES=18,START=7
FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of
the data. A particular user's external view consists of a collection of "logical databases", where each
logical database is a subset of the corresponding physical database. Each logical database is defined
by means of a program communication block (PCB). The set of all PCBs for one user, corresponding
to the external schema plus the associated mapping definition, is called the program specification
block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program
communication block (PCB). The PCB includes a specification of the mapping between the logical
database (LDB) and the corresponding physical database (PDB).
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's
program specification block (PSB).
Example
PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
SENSEG NAME=COURSE,PROCOPT=G
SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to
perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other
possible values are I ("insert"), R ("replace"), and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data
manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End users are
supported via user-written online application programs; IMS does not provide an integrated query
language.
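The hierarchical access pattern that DL/I serves can be illustrated with a small sketch. The following toy Python model is hypothetical (the segment key values are invented samples, and this is not real DL/I syntax): it builds a fragment of the EDUCPDBD hierarchy from the DBD example above and implements a rough analogue of a "get unique" call as a preorder search.

```python
# Toy model of a hierarchical database: a tree of typed segments, searched
# in preorder. get_unique is loosely analogous to a DL/I GU call.

class Segment:
    def __init__(self, seg_type, key, children=None):
        self.seg_type, self.key = seg_type, key
        self.children = children or []

    def preorder(self):
        # Parent before children, matching hierarchical sequence.
        yield self
        for child in self.children:
            yield from child.preorder()

# A fragment of the EDUCPDBD hierarchy (sample key values are invented).
db = Segment("COURSE", "M23", [
    Segment("PREREQ", "M16"),
    Segment("OFFERING", "730813", [
        Segment("TEACHER", "421633"),
        Segment("STUDENT", "102141"),
        Segment("STUDENT", "183009"),
    ]),
])

def get_unique(root, seg_type, key):
    """First segment of the given type with the given key, or None."""
    return next((s for s in root.preorder()
                 if s.seg_type == seg_type and s.key == key), None)

offering = get_unique(db, "OFFERING", "730813")
print([s.key for s in offering.children if s.seg_type == "STUDENT"])
# ['102141', '183009']
```

In real IMS the application would issue successive GU/GN calls through its PCB rather than walk the tree directly; the point here is only the parent-to-child navigation implied by the SEGM/PARENT structure.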
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant)
determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent
on every candidate key, and no attribute in the key can be deleted without
destroying the property of unique identification.
Main characteristics of functional dependencies used in normalization:
- They have a 1:1 relationship between the attribute(s) on the left- and
right-hand sides of the dependency.
- They hold for all time.
- They are nontrivial.
The complete set of functional dependencies for a given relation can be very
large, so it is important to find an approach that can reduce the set to a
manageable size. We need to identify a set of functional dependencies (X) for
a relation that is smaller than the complete set of functional dependencies
(Y) for that relation, and that has the property that every functional
dependency in Y is implied by the functional dependencies in X.
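A standard way to test whether a dependency in Y is implied by the smaller set X is to compute attribute closures. A minimal sketch of the closure algorithm (the FDs used are illustrative assumptions):

```python
# Compute the closure X+ of an attribute set under a set of FDs:
# repeatedly add the right-hand side of any FD whose left-hand side
# is already contained in the result, until nothing changes.

def closure(attrs, fds):
    """attrs: a set of attributes; fds: list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Assumed example: A -> B and B -> C.
fds = [(frozenset("A"), frozenset("B")),
       (frozenset("B"), frozenset("C"))]

print(sorted(closure(frozenset("A"), fds)))  # ['A', 'B', 'C']
```

Since C is in the closure of {A}, the dependency A → C is implied by the two stored FDs and need not be listed separately; this is exactly how a large set Y is reduced to a manageable set X.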
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up
their attributes into smaller relations. The normal form of a relation refers to the highest normal-form
condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties.
Industry practice pays particular attention to normalization up to 3NF, BCNF, or 4NF; we will pay
particular attention up to 3NF. Database designers need not normalize to the highest possible normal
form.
Normalization is a formal technique for analyzing a relation based on its primary key and the
functional dependencies between its attributes. It is often executed as a series of steps, where each
step corresponds to a specific normal form with known properties. As normalization proceeds,
relations become progressively more restricted (stronger) in format and also less vulnerable to update
anomalies.
- NF2: non-first normal form.
- 1NF: R is in 1NF iff all domain values are atomic.
- 2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
- 3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
- BCNF: R is in BCNF iff every determinant is a candidate key.
- Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multi-valued dependencies of
attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and only if it
is in BCNF and every multi-valued dependency is in fact a functional dependency. 4NF thus removes
the unwanted structures: multi-valued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
- there is no multivalued dependency in the relation, or
- there are multivalued dependencies, but the attributes are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it deals with
multivalued dependencies.
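The effect of a multi-valued dependency can be shown with a small sketch (the course/teacher/book data is a hypothetical example, not taken from the text): because Course →→ Teacher and Course →→ Book hold independently, the single table must pair every teacher with every book, and splitting on the MVDs removes that redundancy losslessly.

```python
# A table with two independent multi-valued facts about Physics: its
# teachers and its books. The table is forced to hold their cross product.

course_teacher_book = {
    ("Physics", "Prof. Green", "Mechanics"),
    ("Physics", "Prof. Green", "Optics"),
    ("Physics", "Prof. White", "Mechanics"),
    ("Physics", "Prof. White", "Optics"),
}

# 4NF decomposition: one table per multi-valued dependency.
course_teacher = {(c, t) for (c, t, b) in course_teacher_book}
course_book    = {(c, b) for (c, t, b) in course_teacher_book}

# Joining the two projections on course restores the original table.
rejoined = {(c, t, b)
            for (c, t) in course_teacher
            for (c2, b) in course_book if c2 == c}

print(rejoined == course_teacher_book)  # True: lossless split into two 4NF tables
```

After the split, adding a new book requires one row in course_book instead of one row per teacher, which is precisely the update anomaly 4NF eliminates.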
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by
relational database management systems (RDBMS). Object databases have been around since the
early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
- Most object databases offer some kind of query language, allowing objects to be found by a more
declarative programming approach. It is in the area of object query languages, and the integration of
the query and navigational interfaces, that the biggest differences between products are found. An
attempt at standardization was made by the ODMG with the Object Query Language, OQL.
- Access to data can be faster because joins are often not needed (as in a tabular implementation of a
relational database). This is because an object can be retrieved directly, without a search, by
following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer
following.)
- Another area of variation between products is the way the schema of a database is defined. A
general characteristic, however, is that the programming language and the database schema use the
same type definitions.
- Multimedia applications are facilitated because the class methods associated with the data are
responsible for its correct interpretation.
- Many object databases, for example VOSS, offer support for versioning. An object can be viewed
as the set of all its versions, and object versions can be treated as objects in their own right. Some
object databases also provide systematic support for triggers and constraints, which are the basis of
active databases.
- The efficiency of such a database is also greatly improved in areas that demand massive amounts of
data about one item. For example, a banking institution could retrieve a user's account information
and efficiently provide extensive information such as transactions and account entries.
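The banking example can be sketched as follows (the class names are hypothetical illustrations): the account object holds direct references to its transactions, so retrieval follows pointers rather than joining on a foreign key.

```python
# In an object database, related objects are reached by following the
# references an object already holds, not by joining tables on account_id.

class Transaction:
    def __init__(self, amount):
        self.amount = amount

class Account:
    def __init__(self, owner):
        self.owner = owner
        self.transactions = []   # direct object references, no foreign keys

acct = Account("alice")
acct.transactions += [Transaction(120.0), Transaction(-45.5)]

# "Join-free" retrieval: traverse the pointers held by the account object.
print(sum(t.amount for t in acct.transactions))  # 74.5
```

In a relational schema the same query would scan or index a transactions table for rows matching the account's key; here the references are the access path.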
C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used
determines how much time and space your backups will take, and how great your risk of data loss
will be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems. This is why you have
to explore the available options in order to prepare for the worst.
SQL Server database recovery is more easily achieved if you are running at least SQL Server 2000,
which has a built-in feature known as the database recovery model that controls the following:
- the speed and size of your transaction log backups, and
- the degree to which you might be at risk of losing committed transactions in the event of media
failure.
Models
There are three types of database recovery model available:
- Full Recovery
- Bulk-Logged Recovery
- Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to
the transaction log. When data files are lost because of media failure, the transaction log can be
backed up.
- Database restoration up to any specified time can be achieved after a media failure for a database
file has occurred. If your log file is available after the failure, you can restore up to the last
committed transaction.
- The Log Marks feature allows you to place reference points in the transaction log that let you
recover to a log mark.
- CREATE INDEX operations are logged. Recovery from a transaction log backup that includes
index creations is faster because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using
the least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This model allows the fastest bulk operations and the simplest backup-and-restore strategy. Under
this model, SQL Server truncates the transaction log at regular intervals, removing committed
transactions.
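The idea common to all three models, replaying a transaction log and keeping only committed work, can be illustrated with a minimal sketch. This is a toy model of log replay, not SQL Server internals:

```python
# A transaction log as a list of records. On recovery, only operations
# belonging to transactions that reached "commit" are re-applied.

log = [
    ("T1", "set", "x", 10),
    ("T2", "set", "y", 20),
    ("T1", "commit", None, None),
    ("T2", "set", "z", 30),   # T2 never commits: its work is lost on recovery
]

def recover(log):
    committed = {txid for (txid, op, k, v) in log if op == "commit"}
    state = {}
    for txid, op, key, value in log:
        if op == "set" and txid in committed:
            state[key] = value
    return state

print(recover(log))  # {'x': 10}
```

The recovery models differ in how much of such a log is retained: full recovery keeps everything (so any point in time is reachable), while simple recovery truncates committed portions, trading point-in-time restore for smaller logs.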
(d) Describe deadlocks in a distributed system
Ans
information easily that would have been impossible to obtain otherwise. The use of a DBMS also
allows users who don't know programming to interact with the data more easily, unlike a file
processing system, where a programmer may need to write new programs to meet every new
demand.
4. Flexibility of the system is improved - Since changes are often necessary to the contents of
the data stored in any system, these changes are made more easily in a centralized database
than in a conventional system. Application programs need not be changed when the data in the
database changes.
5. Integrity can be improved - Since the data of an organization using the database approach is
centralized and used by a number of users at a time, it is essential to enforce integrity
constraints.
In conventional systems, because the data is duplicated in multiple files, updates or changes
may sometimes lead to the entry of incorrect data in some of the files where it exists.
6. Standards can be enforced - Since all access to the database must be through the DBMS,
standards are easier to enforce. Standards may relate to the naming of data, the format of data,
the structure of the data, etc. Standardizing stored data formats is usually desirable for the
purpose of data interchange or migration between systems.
7. Security can be improved - In conventional systems, applications are developed in an ad hoc,
temporary manner. Often different systems of an organization access different components of
the operational data; in such an environment, enforcing security can be quite difficult. Setting
up a database makes it easier to enforce security restrictions, since the data is now centralized.
It is easier to control who has access to what parts of the database, and different checks can be
established for each type of access (retrieve, modify, delete, etc.) to each piece of information
in the database.
8. The organization's requirements can be identified - All organizations have sections and
departments, and each of these units often considers its own work, and therefore its own
needs, the most important. Once a database has been set up with centralized control, it
becomes necessary to identify the organization's requirements and to balance the needs of the
competing units. It may become necessary to ignore some requests for information if they
conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system
to provide the overall service that is best for the organization.
9. The overall cost of developing and maintaining systems is lower - It is much easier to respond
to unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large, one
normally expects the overall cost of setting up the database and developing and maintaining
application programs to be far lower than for a similar service using conventional systems,
since the productivity of programmers can be higher using the non-procedural languages
developed for DBMSs than using procedural languages.
10. A data model must be developed - Perhaps the most important advantage of setting up a
database system is the requirement that an overall data model for the organization be built. In
conventional systems, it is more likely that files will be designed as the needs of particular
applications demand; the overall view is often not considered. Building an overall view of an
organization's data is usually cost-effective in the long term.
11. Provides backup and recovery - Centralizing a database provides schemes such as recovery
and backup from failures, including disk crashes, power failures, and software errors, which
may help the database recover from an inconsistent state to the state that existed prior to the
failure, though the methods are very complex.
QUE2- EITHER
(A) Explain ER model with suitable example
Ans: The E-R model is a "top-down" approach. This data model allows us to describe how data is
used in a real-world enterprise. Modelling is an iterative, team-oriented process involving all
business managers (or their designates), and the result should be validated with a "bottom-up"
approach. The model has three primary components: entities, relationships, and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships, and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent
existence and which can be uniquely identified. An entity is an abstraction from the complexities of
some domain. When we speak of an entity, we normally speak of some aspect of the real world
which can be distinguished from other aspects of the real world. An entity may be a physical object,
such as a house or a car; an event, such as a house sale or a car service; or a concept, such as a
customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and
there are usually many instances of an entity-type. Because the term entity-type is somewhat
cumbersome, most people tend to use the term entity as a synonym.
Attributes: An attribute is a characteristic of an entity. A Student entity's attributes might be student
ID, student name, address, etc.
Attributes are of various types
Simple/Single Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another.
Relationships can be thought of as verbs linking two or more nouns. Examples: an "owns"
relationship between a company and a computer, a "supervises" relationship between an employee
and a department, a "performs" relationship between an artist and a song, a "proved" relationship
between a mathematician and a theorem. Relationships are represented as diamonds connected by
lines to each of the entities in the relationship. The types of relationships are as follows:
One-to-many (1:M)
Many-to-one (M:1)
Many-to-many (M:N)
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name,
middle_name), phone_number, date_of_birth, and address (city, state, zip_code, street), where street
is itself composite (street_name, street_number, apartment_number).
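One way to make the composite-attribute structure concrete is to model the entity directly. The sketch below is one possible mapping of the attributes above into nested types; the class layout is an illustration, not the only correct design.

```python
# The Customer entity with its composite attributes (address, street)
# modelled as nested dataclasses, and the primary key as a plain field.

from dataclasses import dataclass

@dataclass
class Street:
    street_name: str
    street_number: str
    apartment_number: str = ""   # optional component

@dataclass
class Address:
    city: str
    state: str
    zip_code: str
    street: Street               # composite attribute nested in a composite

@dataclass
class Customer:
    customer_id: int             # primary key
    first_name: str
    last_name: str
    phone_number: str
    date_of_birth: str
    address: Address
    middle_name: str = ""

c = Customer(1, "Ada", "Lovelace", "555-0100", "1815-12-10",
             Address("London", "", "N1", Street("Main St", "12")))
print(c.address.street.street_name)  # Main St
```

A multivalued attribute (say, several phone numbers) would become a list field, and a derived attribute such as age would be computed from date_of_birth rather than stored.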
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example
Ans: In sequential files, index-sequential files, and direct files, we considered the retrieval and
update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key
retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file on the attribute "stud_name", we can get the set of
records which satisfy the given value.
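A secondary index over stud_name can be sketched as a map from each attribute value to the list of matching record positions. The sample records below are hypothetical:

```python
# Secondary-key retrieval: stud_name is not unique, so the index maps each
# name to ALL record positions holding that value.

students = [
    {"stud_id": 1, "stud_name": "Asha", "dept": "MCA"},
    {"stud_id": 2, "stud_name": "Ravi", "dept": "MCA"},
    {"stud_id": 3, "stud_name": "Asha", "dept": "MBA"},
]

# Build the secondary index: value of stud_name -> record positions.
name_index = {}
for pos, rec in enumerate(students):
    name_index.setdefault(rec["stud_name"], []).append(pos)

# One secondary-key lookup may return several records.
matches = [students[p] for p in name_index.get("Asha", [])]
print([m["stud_id"] for m in matches])  # [1, 3]
```

This contrasts with primary-key retrieval, where the index entry for a key value points to exactly one record.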
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3 - EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give an equivalent QBE
expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
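Assuming the intended queries are the union, intersection, and difference of r1 and r2, their results can be sketched with Python sets standing in for the relations (the sample tuples are assumed for illustration):

```python
# r1 and r2 are union-compatible relations on schema R(A, B, C),
# modelled as sets of (A, B, C) tuples.

r1 = {(1, "a", True), (2, "b", False)}
r2 = {(2, "b", False), (3, "c", True)}

union        = r1 | r2   # r1 ∪ r2: tuples in either relation
intersection = r1 & r2   # r1 ∩ r2: tuples in both relations
difference   = r1 - r2   # r1 − r2: tuples in r1 but not in r2

print(len(union), len(intersection), len(difference))  # 3 1 1
```

In QBE each of these is expressed with example rows in the two table skeletons (with negation for the difference); the set semantics shown here is what those skeletons compute.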
QUE4- EITHER
(A) What is join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows: if a table can be decomposed into three or more
smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it
cannot have a lossless decomposition into any number of smaller tables. Another way of expressing
this is that every join dependency is a consequence of the candidate keys. It can also be expressed as:
there are no pairwise cyclical dependencies in a primary key composed of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further,
then it is in 5NF.
Pairwise cyclical dependency means that:
- you always need to know two values (pairwise);
- for any one value, you must know the other two (cyclical).
(B) Explain the architecture of an IMS System
Ans Information Management system (IMS) is an IBM program product that is designed to support
both batch and online application programs
Host Language
+
DLI
Host Language
+
DLI
PCB PCB
DBD DBD DBD DBD DBD DBD hellip
IMS
Control
program
PCB PCB
PSB-B PSB-A
Application A Application B
Conceptual View
The conceptual view consists of collection of physical database The ldquophysicalrdquo is somewhat
misleading in this context since the user does not see such a database exactly as it is stored indeed
IMS provides a fairely high degree of insulation of the user from the storage structure Each physical
database is defined by a database description (DBD) The mapping of the physical database to storage
is also DBDrsquos corresponds to the conceptual schema plus the associated conceptualinternal mapping
definition
DBD (Database Description) Each physical databse is defined together with its mapping to
storage by a databse description (DBD) The source form of the DBD is written using a special
System370 Assembler Language macro statements Once written the DBD is assembled and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program
All names of DBDrsquos in IMS are limited to a maximum length of eight characters
Example
1 DBD NAMEEDUCPDBD
2 SEGM NAME=COURSEBYTES=256
3 FILED NAME=(COURSESEQ)BYTES=3START=1 4 FIELD NAME=TITLE BYTES=33START=4
5 FIELD NAME=DESCRIPNBYETS=220START=37
6 SEGM NAME=PREREQPARENT=COURSEBYTES=36 7 FILED NAME=(COURSESEQ)BYTES=3START=1
8 FIELD NAME=TITLE BYTES=33START=4
9 SEGM NAME=OFFERINGPARENT=COURSEBYTES=20 10 FILED NAME=(DATESEQM)BYTES=6START=1
11 FIELD NAME=LOCATION BYTES=12START=7
12 FIELD NAME=FORMATBYETS=2START=19 13 SEGM NAME=TEACHERPARENT=OFFERINGBYTES=24
14 FIELD NAME=(EMPSEQ) BYTES=6START=1
15 FIELD NAME=NAMEBYETS=18START=7
16 SEGM NAME=STUDENTPARENT=OFFERINGBYTES=25
17 FILED NAME=(EMPSEQ)BYTES=6START=1
18 FIELD NAME=NAME BYTES=18START=7
19 FIELD NAME=GRADEBYTES=1START=25
External View
The user does not operate directly at the physical database level but rather on an ldquoexternal viewrdquo of
the data A particular userrsquos external view consists of a collection of ldquological databasesrdquo where each
logical database is a subset of the corresponding physical database Each logical database is defined
by means of a program communication block (PCB) The set of all PCBrsquos for one user corresponding
to the external schema plus the associated mapping definition is called program specification block
(PSB)
PCB Program Communication BLOCK Each logical Database is defined by a program
communication block (PCB) The PCB includes a specification of the mapping between the LDB and
the corresponding PDB
PSB Program Specification BLOCK The set of all PCBrsquos for a given user forms that userrsquos
program specification block (PSB)
Example
1 PCB TYPE=DBDBNAME=EDUCPDBDKEYLEN=15
2 SENSEG NAME=COURSEPROCOPT=G 3 SENSEG NAME=OFFERINGPARENT=COURSEPROCOPT=G
4 SENSEG NAME=STUDENTPARENT=OFFERINGPROCOPT=G
PROCOPT The PROCOPT entry specifies the types of operation that the user will be permitting to
perform on this segment In this example the entry is G (ldquogetrdquo) indicating retrieval only Other
possible values are I(ldquoinsertrdquo) R(ldquoreplacerdquo) and D(ldquodeleterdquo)
Internal View
The users are ordinary application programmers using a host language from which the IMS data
manipulation language DLI- ldquoData LanguageIrdquo- may be invoked by subroutine call End-users are
supported via user-written on-line application programs IMS does not provide an integrated query
language
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency The value of one attribute (the determinant)
determines the value of another attribute
Candidate Key A possible key
Each non-key field is functionally dependent on every candidate key
No attribute in the key can be deleted without destroying the property of
unique identification
Main characteristics of functional dependencies used in
normalization
have a 11 relationship between attribute(s) on left and right-hand side of
a dependency hold for all time are nontrivial
Complete set of functional dependencies for a given relation can be very
large
Important to find an approach that can reduce set to a manageable size
Need to identify set of functional dependencies (X) for a relation that is
smaller than complete set of functional dependencies (Y) for that relation
and has property that every functional dependency in Y is implied by
functional dependencies in X
(D) Explain 4 NF with examples
Ans Normalization The process of decomposing unsatisfactory ldquobadrdquo relations by breaking up
their attributes into smaller relationsThe normal form of a relation refers to the highest normal form
condition that a relation meets and indicates the degree to which it has been normalized
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties
Normalization in industry pays particular attention to normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
The database designers need not normalize to the highest possible normal form
Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes
Often executed as a series of steps Each step corresponds to a specific normal form which has
known properties
As normalization proceeds relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies
7 NF2 non-first normal form 8 1NF R is in 1NF iff all domain values are atomic2
9 2NF R is in 2 NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
10 3NF R is in 3NF iff R is 2NF and every nonkey attribute is non-transitively dependent on the
key 11 BCNF R is in BCNF iff every determinant is a candidate key
12 Determinant an attribute on which some other attribute is fully functionally dependent Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies of
attribute sets on something other than a superset of a candidate key A table is said to be in 4NF if and
only if it is in the BCNF and multi-valued dependencies are functional dependencies The 4NF
removes unwanted data structures multi-valued dependencies
There is no Multivalued dependency in the relation
There are Multivalued dependency but the attributes are dependent between themselves
Either of these conditions must hold true in order to be fourth normal form
The relation must also be in BCNF Fourth normal form differs from BCNF only in that it uses
Multivalued dependencies
Q5
Either
(A) What are object oriented database systems What are its features
Ans Object databases are a niche field within the broader DBMS market dominated by relational
database management systems (RDBMS) Object databases have been considered since the early 1980s
and 1990s but they have made little impact on mainstream commercial data proc
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
C) How database recovery it done Discuss its different types
Ans SQL Server database recovery models give you backup-and-restore flexibility The model used will determine how much time and space your backups will take and how great your risk of data loss will
be when a breakdown occurs
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL server database recovery can be easier achieved if you are running on at least the SQL server 2000
It has a built in feature known as the database recovery model that controls the following
Both the speed and size of your transaction log backups The degree to which you might be at risk of losing committed transactions in the event of
media failure
Models
There are three types of database recovery models available
Full Recovery Bulk Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery The SQL Server fully logs all operations so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log When data files are lost because of media failure the transaction log can be backed up
Database restoration up to any specified time can be achieved after media failure for a database
file has occurred If your log file is available after the failure you can restore up to the last
transaction committed Log Marks feature allows you to place reference points in the transaction log that allow you to
recover a log mark
Logs CREATE INDEX operations Recovery from a transaction log backup that includes index
creations is done at a faster pace because the index does not have to be rebuilt
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model It allows for the fastest bulk operations and the simplest backup-and-restore strategy Under this model
SQL Server truncates the transaction log at regular intervals removing committed transactions
(d)Describe Deadlocks a Distributed System
Ans
- Components of DBMS
- DBMS Three Level Architecture Diagram
- 1 External level
- 2 Conceptual level
- 3 Internal level
- 1 Logical Data Independence
- 2 Physical Data Independence
- Sequential File Organization
- Example
- Consequences of a Lack of Referential Integrity
- One-to-one (11)
- One-to-Many (1M)
- Many-to-Many (MN)
- Lock-based Protocols
-
- Simplistic Lock Protocol
- Pre-claiming Lock Protocol
- Two-Phase Locking 2PL
- Strict Two-Phase Locking
-
- Timestamp-based Protocols
- Internet as a knowledge base[edit]
-
- Ans Join Dependencies (JD)
-
10 Data Model must be developed - Perhaps the most important advantage of setting up of
database system is the requirement that an overall data model for an organization be build In
conventional systems it is more likely that files will be designed as per need of particular
applications demand The overall view is often not considered Building an overall view of an
organizations data is usual cost effective in the long terms
11 Provides backup and Recovery - Centralizing a database provides the schemes such as
recovery and backups from the failures including disk crash power failures software errors
which may help the database to recover from the inconsistent state to the state that existed
prior to the occurrence of the failure though methods are very complex
QUE 2 - EITHER
(A) Explain the ER model with a suitable example.
Ans: The ER model is a "top-down" approach.
This data model allows us to describe how data is used in a real-world enterprise. Building it is an iterative, team-oriented process in which all business managers (or their designates) should be involved, and the result should be validated with a "bottom-up" approach. It has three primary components: entity, relationship and attributes.
There are many notation methods; Chen's was the first to become established.
The building blocks of the E-R model are entities, relationships and attributes.
Entity: An entity may be defined as a thing which is recognized as being capable of an independent existence and which can be uniquely identified. An entity is an abstraction from the complexities of some domain; when we speak of an entity we normally speak of some aspect of the real world which can be distinguished from other aspects of the real world. An entity may be a physical object such as a house or a car, an event such as a house sale or a car service, or a concept such as a customer transaction or order.
An entity-type is a category; an entity, strictly speaking, is an instance of a given entity-type, and there are usually many instances of an entity-type. Because the term entity-type is somewhat cumbersome, most people tend to use the term entity as a synonym for it.
Attributes: An attribute is a characteristic of an entity. A Student entity's attributes include student ID, student name, address, etc.
Attributes are of various types
Simple/Single Attributes
Composite Attributes
Multivalued attributes
Derived attributes
Relationship: A relationship captures how two or more entities are related to one another. Relationships can be thought of as verbs linking two or more nouns. Examples: an owns relationship between a company and a computer, a supervises relationship between an employee and a department, a performs relationship between an artist and a song, a proved relationship between a mathematician and a theorem. Relationships are represented as diamonds connected by lines to each of the entities in the relationship. The types of relationships are as follows:
One to many: 1 ------- M
Many to one: M ------- 1
Many to many: M ------- M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given: Entity Customer with attributes customer_id (primary key), name (first_name, last_name, middle_name), phone_number, date_of_birth, address (city, state, zip_code, street), street (street_name, street_number, apartment_number).
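A common way to realize such an ER design is to map the entity to a relational table, flattening the composite attributes (name, address, street) into their component columns. A sketch using Python's built-in sqlite3 module (the sample row is invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Composite attributes are flattened into their components;
# customer_id is the primary key from the ER diagram.
conn.execute("""
    CREATE TABLE customer (
        customer_id      INTEGER PRIMARY KEY,
        first_name       TEXT,
        last_name        TEXT,
        middle_name      TEXT,
        phone_number     TEXT,
        date_of_birth    TEXT,
        city             TEXT,
        state            TEXT,
        zip_code         TEXT,
        street_name      TEXT,
        street_number    TEXT,
        apartment_number TEXT
    )
""")
conn.execute(
    "INSERT INTO customer (customer_id, first_name, last_name) VALUES (?, ?, ?)",
    (1, "Asha", "Rao"),
)
print(conn.execute("SELECT first_name, last_name FROM customer").fetchall())
```

A multivalued attribute (for example, several phone numbers per customer) would instead go into its own table keyed by customer_id, since a single column cannot hold a set of values in 1NF.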
--------------------------------------------------------------------------------------------------------
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: In sequential files, index-sequential files and direct files we have considered the retrieval and update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file on the attribute "stud_name", we may get a set of records which satisfy the given value.
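The idea can be sketched in Python: alongside the primary organization (student ID to record), a secondary index maps each stud_name value to the list of matching primary keys, so one lookup may return several records. Names and data here are invented for illustration:

```python
# Primary organization: records addressed by primary key (stud_id).
students = {
    1: {"stud_id": 1, "stud_name": "Patil", "city": "Nagpur"},
    2: {"stud_id": 2, "stud_name": "Deshmukh", "city": "Pune"},
    3: {"stud_id": 3, "stud_name": "Patil", "city": "Mumbai"},
}

# Secondary index on the non-unique attribute stud_name: value -> list of primary keys.
name_index = {}
for pk, rec in students.items():
    name_index.setdefault(rec["stud_name"], []).append(pk)

def lookup_by_name(name):
    """Secondary key retrieval: may return multiple records for one key value."""
    return [students[pk] for pk in name_index.get(name, [])]

print([r["stud_id"] for r in lookup_by_name("Patil")])  # [1, 3] -- two records match
```

This is exactly the contrast with primary key retrieval: the primary index yields at most one record, while the secondary index yields a set.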
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation
QUE 3 - EITHER
(A) Let R(A, B, C), and let r1 and r2 both be relations on schema R. Give an equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE4- EITHER
(A) What is a join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows:
1. If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJ/NF), if it is in 4NF and it cannot be further non-loss decomposed into any number of smaller tables. Another way of expressing this is that every join dependency in the table is a consequence of its candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes. Anomalies can occur in 4NF relations if the primary key has three or more fields.
5NF is based on the concept of join dependency: if a relation cannot be non-loss decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
- you always need to know two values (pairwise);
- for any one value you must know the other two (cyclical).
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
Buyer   Vendor          Item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers
The question is: what do you do if Claiborne starts to sell jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
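The decomposition and its lossless reconstruction can be checked mechanically. A small Python sketch using the sample data above: the three binary projections, natural-joined on their common attributes, reproduce exactly the original Buying table, which is the join dependency that 5NF exploits.

```python
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach",       "Jeans"),
    ("Mary",  "Jordach",       "Jeans"),
    ("Sally", "Jordach",       "Sneakers"),
}

# Project onto the three binary tables of the 5NF decomposition.
buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item   = {(b, i) for b, v, i in buying}
vendor_item  = {(v, i) for b, v, i in buying}

# Natural join of the three projections on their shared attributes.
rejoined = {
    (b, v, i)
    for (b, v) in buyer_vendor
    for (b2, i) in buyer_item if b2 == b
    for (v2, i2) in vendor_item if v2 == v and i2 == i
}

print(rejoined == buying)  # True: the join dependency holds, no spurious tuples
```

Recording "Claiborne starts to sell jeans" in the decomposed design then takes a single row in Vendor-Item, instead of one three-column row per buyer in the original table.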
(B) Explain the architecture of an IMS System
Ans: Information Management System (IMS) is an IBM program product designed to support both batch and online application programs.
[Architecture diagram: Application A and Application B, each written in a host language with embedded DL/I calls, access the IMS control program through their own program specification blocks (PSB-A, PSB-B), each consisting of PCBs; the IMS control program maps these onto the physical databases defined by the DBDs.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined, together with its mapping to storage, by a database description (DBD). The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All DBD names in IMS are limited to a maximum length of eight characters.
Example
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE,BYTES=256
3  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
4  FIELD NAME=TITLE,BYTES=33,START=4
5  FIELD NAME=DESCRIPN,BYTES=220,START=37
6  SEGM  NAME=PREREQ,PARENT=COURSE,BYTES=36
7  FIELD NAME=(COURSE,SEQ),BYTES=3,START=1
8  FIELD NAME=TITLE,BYTES=33,START=4
9  SEGM  NAME=OFFERING,PARENT=COURSE,BYTES=20
10 FIELD NAME=(DATE,SEQ,M),BYTES=6,START=1
11 FIELD NAME=LOCATION,BYTES=12,START=7
12 FIELD NAME=FORMAT,BYTES=2,START=19
13 SEGM  NAME=TEACHER,PARENT=OFFERING,BYTES=24
14 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
15 FIELD NAME=NAME,BYTES=18,START=7
16 SEGM  NAME=STUDENT,PARENT=OFFERING,BYTES=25
17 FIELD NAME=(EMP,SEQ),BYTES=6,START=1
18 FIELD NAME=NAME,BYTES=18,START=7
19 FIELD NAME=GRADE,BYTES=1,START=25
External View
The user does not operate directly at the physical database level but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called the program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the logical database (LDB) and the corresponding physical database (PDB).
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example
1 PCB    TYPE=DB,DBNAME=EDUCPDBD,KEYLEN=15
2 SENSEG NAME=COURSE,PROCOPT=G
3 SENSEG NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4 SENSEG NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on the segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are supported via user-written online application programs. IMS does not provide an integrated query language.
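The hierarchy the DBD above describes (COURSE at the root, with OFFERING beneath it and TEACHER and STUDENT beneath OFFERING) can be mimicked with nested records. A toy Python sketch of navigating such a database top-down, the way a sequence of DL/I get-next calls would; the segment contents are invented for illustration:

```python
# A COURSE root segment with its dependent segments nested as in the DBD tree.
course = {
    "COURSE": "M23",
    "TITLE": "Dynamics",
    "OFFERING": [
        {"DATE": "730813", "LOCATION": "Oslo",
         "TEACHER": [{"EMP": "421633", "NAME": "Kristiansen"}],
         "STUDENT": [{"EMP": "183009", "NAME": "Andersen", "GRADE": "A"},
                     {"EMP": "761620", "NAME": "Berg",     "GRADE": "B"}]},
    ],
}

def students_of(course_seg):
    """Walk COURSE -> OFFERING -> STUDENT, as repeated get-next calls would."""
    for offering in course_seg["OFFERING"]:
        for student in offering["STUDENT"]:
            yield student["NAME"]

print(list(students_of(course)))
```

The point of the sketch is that access paths follow the parent-child structure: you reach a STUDENT segment only through its OFFERING, and an OFFERING only through its COURSE.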
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
The main characteristics of the functional dependencies used in normalization are that they:
- have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency;
- hold for all time;
- are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation and has the property that every functional dependency in Y is implied by the functional dependencies in X.
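Whether a dependency X → Y holds in a given relation instance can be tested directly from the definition: any two tuples that agree on X must agree on Y. A small sketch (the emp relation is invented for illustration):

```python
def holds(rows, lhs, rhs):
    """Check whether the FD lhs -> rhs holds in this relation instance."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:   # same determinant, different dependent value
            return False
    return True

emp = [
    {"emp_id": 1, "dept": "Sales", "dept_city": "Pune"},
    {"emp_id": 2, "dept": "Sales", "dept_city": "Pune"},
    {"emp_id": 3, "dept": "HR",    "dept_city": "Nagpur"},
]

print(holds(emp, ["dept"], ["dept_city"]))   # True: dept determines dept_city
print(holds(emp, ["dept_city"], ["emp_id"])) # False: Pune maps to two emp_ids
```

Note that a check on one instance can only refute a dependency, not prove it: as the text says, a functional dependency is a statement that must hold for all time, not just for the current data.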
(D) Explain 4 NF with examples
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal-form condition that it meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties. Normalization in industry pays particular attention to the normal forms up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, each of which corresponds to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there be no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and only if it is in BCNF and every non-trivial multivalued dependency in it is in fact a functional dependency. 4NF thus removes the unwanted structures caused by multivalued dependencies.
One of the following conditions must hold for a relation to be in fourth normal form:
- there is no multivalued dependency in the relation; or
- there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF; fourth normal form differs from BCNF only in that it considers multivalued dependencies.
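A classic illustration (an example supplied here, not taken from the question paper) is a relation Course(course, teacher, book) where course →→ teacher and course →→ book hold independently: each course's set of teachers must pair with its full set of books, so the table stores a cross product. Decomposing into (course, teacher) and (course, book) removes the redundancy, and their natural join restores the original:

```python
from itertools import product

teachers = {"DB": ["Rao", "Iyer"]}
books    = {"DB": ["Korth", "Date"]}

# The unnormalized relation: every teacher pairs with every book (the MVD).
course_tb = {(c, t, b)
             for c in teachers
             for t, b in product(teachers[c], books[c])}
assert len(course_tb) == 4          # 2 teachers x 2 books = redundant storage

# 4NF decomposition: one table per independent multivalued fact.
course_teacher = {(c, t) for c, t, b in course_tb}
course_book    = {(c, b) for c, t, b in course_tb}

# Natural join on course reconstructs the original without loss.
rejoined = {(c, t, b)
            for (c, t) in course_teacher
            for (c2, b) in course_book if c2 == c}
print(rejoined == course_tb)  # True
```

Adding a third book to the course now takes one row in course_book instead of one row per teacher in the original table, which is exactly the anomaly 4NF removes.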
Q5
Either
(A) What are object oriented database systems What are its features
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been studied since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database). This is because an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is the way that the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could retrieve a user's account information and efficiently provide extensive details such as transactions and account entries.
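The "no join needed" point can be seen in a toy sketch: once the account object is in hand, its transactions are reached by following object references rather than by matching key values across tables. The class names and fields below are invented for illustration:

```python
class Transaction:
    def __init__(self, amount, memo):
        self.amount = amount
        self.memo = memo

class Account:
    def __init__(self, owner):
        self.owner = owner
        self.transactions = []   # direct object references, not foreign keys

    def post(self, amount, memo):
        self.transactions.append(Transaction(amount, memo))
        return self

acct = Account("R. Kulkarni").post(1500, "deposit").post(-200, "ATM")

# Navigation: follow pointers from the account to its transactions -- no join.
balance = sum(t.amount for t in acct.transactions)
print(balance)  # 1300
```

In a relational design the same query would join an accounts table to a transactions table on account_id; here the reference is stored once and followed directly.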
(C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used determines how much time and space your backups will take and how great your risk of data loss will be when a breakdown occurs.
System breakdowns happen all the time, even to the best-configured systems, which is why you have to explore the available options in order to prepare for the worst.
SQL Server database recovery is easier to achieve if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls:
- both the speed and size of your transaction log backups;
- the degree to which you are at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery model available:
1. Full Recovery
2. Bulk-Logged Recovery
3. Simple Recovery
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log, and when data files are lost because of media failure the transaction log can be backed up.
- Database restoration up to any specified point in time can be achieved after media failure of a database file. If your log file is available after the failure, you can restore up to the last committed transaction.
- The Log Marks feature allows you to place reference points in the transaction log, so that you can recover to a log mark.
- Logs CREATE INDEX operations. Recovery from a transaction log backup that includes index creations is faster because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance, using the least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
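The common thread of all three models is the transaction log: after a failure, committed work is redone from the log and uncommitted work is discarded. A deliberately simplified redo sketch (an illustration of the principle, not SQL Server's actual algorithm):

```python
# Each log record: (txn_id, action, payload). T2 never reached COMMIT
# before the crash, so its write must not survive recovery.
log = [
    ("T1", "WRITE", ("x", 10)),
    ("T2", "WRITE", ("y", 99)),
    ("T1", "WRITE", ("z", 5)),
    ("T1", "COMMIT", None),
]

def recover(log):
    committed = {t for t, a, _ in log if a == "COMMIT"}
    db = {}
    for txn, action, payload in log:       # redo pass, in log order
        if action == "WRITE" and txn in committed:
            key, value = payload
            db[key] = value
    return db

print(recover(log))  # {'x': 10, 'z': 5} -- T2's write to y is discarded
```

Under the simple recovery model this log is truncated after each checkpoint, which is precisely why point-in-time restore is lost: the records needed to redo earlier work are gone.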
(d)Describe Deadlocks a Distributed System
Ans
- Components of DBMS
- DBMS Three Level Architecture Diagram
- 1 External level
- 2 Conceptual level
- 3 Internal level
- 1 Logical Data Independence
- 2 Physical Data Independence
- Sequential File Organization
- Example
- Consequences of a Lack of Referential Integrity
- One-to-one (11)
- One-to-Many (1M)
- Many-to-Many (MN)
- Lock-based Protocols
-
- Simplistic Lock Protocol
- Pre-claiming Lock Protocol
- Two-Phase Locking 2PL
- Strict Two-Phase Locking
-
- Timestamp-based Protocols
- Internet as a knowledge base[edit]
-
- Ans Join Dependencies (JD)
-
Multivalued attributes
Derived attributes
Relationship Relationship captures how two or more entities are related to one another Relationships can
be thought of as verbs linking two or more nouns Examples an owns relationship between a company and a computer a supervises relationship between an employee and a department a performs relationship
between an artist and a song a proved relationship between a mathematician and a theorem Relationships
are represented as diamonds connected by lines to each of the entities in the relationship Types of
relationships are as follows
One to many 1lt------- M Many to one M------1
Many to many M------M
Symbols and their meanings
Rectangles represent entity sets
Diamonds represent relationship sets
Lines link attributes to entity sets and entity sets to relationship sets
Ellipses represent attributes
Double ellipses represent Multivalued attributes
Dashed ellipses denote derived attributes
Underline indicates primary key attributes
Example
Given Entity Customer with attributes customer_id(primary key) name( first_name last_name
middle_name) phone_number date_of_birth address(citystatezip_codestreet)
Street(Street_namestreet_numberapartment_number)
--------------------------------------------------------------------------------------------------------
(c)Illustrate the construction of secondrery key retrieval with a suitable example
Ans In sequential File Index Sequential file and Direct File we have considered the retrieval and
update of data based on primary key
(i)We can retrieve and update data based on secondary key called as secondary key retrieval
(ii)In secondary key retrieval there are multiple records satisfying a given key value
(iii)For eg if we search a student file based on the attribute ldquostud_namerdquo we can get the set of
records which satisfy the given value
(D)Define the following terms -
(i) Specialization
(ii) Association
(iii) Relationship
(iv) Aggregation QUE 3-EITHER
(A) Let R(ABC) and Let r1 and r2 both be relations on schema R given the equivalent QBE
expression for each of the following queries -
(i) Y1 u y2
(ii) Y1 u y2
(iii) R1-r2
QUE4- EITHER
(A) What is join dependency Discuss 5NF
Ans Join Dependencies (JD)
A join dependency can be described as follows
1 If a table can be decomposed into three or more smaller tables it must be capable of being joined
again on common keys to form the original table
A table is in fifth normal form (5NF) or Projection-Join Normal Form (PJNF) if it is in 4NF and it cannot
have a lossless decomposition into any number of smaller tables
Another way of expressing this is and each join dependency is a consequence of the candidate keys
It can also be expressed as there are no pair wise cyclical dependencies in the primary key
comprised of three or more attributes
Anomalies can occur in relations in 4NF if the primary key has three or more fields
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further then it is in 5NF
Pair wise cyclical dependency means that
You always need to know two values (pair wise)
For any one you must know the other two (cyclical)
Example Buying(buyer vendor item)
This is used to track buyers what they buy and from whom they buy
Take the following sample data
buyer vendor Item
Sally Liz Claiborne Blouses
Mary Liz Claiborne Blouses
Sally Jordach Jeans
Mary Jordach Jeans
Sally Jordach Sneakers
The question is what do you do if Claiborne starts to sell Jeans How many records must you create to
record this fact
The problem is there are pairwise cyclical dependencies in the primary key That is in order to determine
the item you must know the buyer and vendor and to determine the vendor you must know the buyer and
the item and finally to know the buyer you must know the vendor and the item The solution is to break
this one table into three tables Buyer-Vendor Buyer-Item and Vendor-Item
(B) Explain the architecture of an IMS System
Ans Information Management system (IMS) is an IBM program product that is designed to support
both batch and online application programs
Host Language
+
DLI
Host Language
+
DLI
PCB PCB
DBD DBD DBD DBD DBD DBD hellip
IMS
Control
program
PCB PCB
PSB-B PSB-A
Application A Application B
Conceptual View
The conceptual view consists of collection of physical database The ldquophysicalrdquo is somewhat
misleading in this context since the user does not see such a database exactly as it is stored indeed
IMS provides a fairely high degree of insulation of the user from the storage structure Each physical
database is defined by a database description (DBD) The mapping of the physical database to storage
is also DBDrsquos corresponds to the conceptual schema plus the associated conceptualinternal mapping
definition
DBD (Database Description) Each physical databse is defined together with its mapping to
storage by a databse description (DBD) The source form of the DBD is written using a special
System370 Assembler Language macro statements Once written the DBD is assembled and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program
All names of DBDrsquos in IMS are limited to a maximum length of eight characters
Example
1 DBD NAMEEDUCPDBD
2 SEGM NAME=COURSEBYTES=256
3 FILED NAME=(COURSESEQ)BYTES=3START=1 4 FIELD NAME=TITLE BYTES=33START=4
5 FIELD NAME=DESCRIPNBYETS=220START=37
6 SEGM NAME=PREREQPARENT=COURSEBYTES=36 7 FILED NAME=(COURSESEQ)BYTES=3START=1
8 FIELD NAME=TITLE BYTES=33START=4
9 SEGM NAME=OFFERINGPARENT=COURSEBYTES=20 10 FILED NAME=(DATESEQM)BYTES=6START=1
11 FIELD NAME=LOCATION BYTES=12START=7
12 FIELD NAME=FORMATBYETS=2START=19 13 SEGM NAME=TEACHERPARENT=OFFERINGBYTES=24
14 FIELD NAME=(EMPSEQ) BYTES=6START=1
15 FIELD NAME=NAMEBYETS=18START=7
16 SEGM NAME=STUDENTPARENT=OFFERINGBYTES=25
17 FILED NAME=(EMPSEQ)BYTES=6START=1
18 FIELD NAME=NAME BYTES=18START=7
19 FIELD NAME=GRADEBYTES=1START=25
External View
The user does not operate directly at the physical database level but rather on an ldquoexternal viewrdquo of
the data A particular userrsquos external view consists of a collection of ldquological databasesrdquo where each
logical database is a subset of the corresponding physical database Each logical database is defined
by means of a program communication block (PCB) The set of all PCBrsquos for one user corresponding
to the external schema plus the associated mapping definition is called program specification block
(PSB)
PCB Program Communication BLOCK Each logical Database is defined by a program
communication block (PCB) The PCB includes a specification of the mapping between the LDB and
the corresponding PDB
PSB Program Specification BLOCK The set of all PCBrsquos for a given user forms that userrsquos
program specification block (PSB)
Example
1 PCB TYPE=DBDBNAME=EDUCPDBDKEYLEN=15
2 SENSEG NAME=COURSEPROCOPT=G 3 SENSEG NAME=OFFERINGPARENT=COURSEPROCOPT=G
4 SENSEG NAME=STUDENTPARENT=OFFERINGPROCOPT=G
PROCOPT The PROCOPT entry specifies the types of operation that the user will be permitting to
perform on this segment In this example the entry is G (ldquogetrdquo) indicating retrieval only Other
possible values are I(ldquoinsertrdquo) R(ldquoreplacerdquo) and D(ldquodeleterdquo)
Internal View
The users are ordinary application programmers using a host language from which the IMS data
manipulation language DLI- ldquoData LanguageIrdquo- may be invoked by subroutine call End-users are
supported via user-written on-line application programs IMS does not provide an integrated query
language
OR
(C) Explain the following -
(i) Functional dependency
Functional Dependency The value of one attribute (the determinant)
determines the value of another attribute
Candidate Key A possible key
Each non-key field is functionally dependent on every candidate key
No attribute in the key can be deleted without destroying the property of
unique identification
Main characteristics of functional dependencies used in
normalization
have a 11 relationship between attribute(s) on left and right-hand side of
a dependency hold for all time are nontrivial
Complete set of functional dependencies for a given relation can be very
large
Important to find an approach that can reduce set to a manageable size
Need to identify set of functional dependencies (X) for a relation that is
smaller than complete set of functional dependencies (Y) for that relation
and has property that every functional dependency in Y is implied by
functional dependencies in X
(D) Explain 4 NF with examples
Ans Normalization The process of decomposing unsatisfactory ldquobadrdquo relations by breaking up
their attributes into smaller relationsThe normal form of a relation refers to the highest normal form
condition that a relation meets and indicates the degree to which it has been normalized
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties
Normalization in industry pays particular attention to normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
The database designers need not normalize to the highest possible normal form
Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes
Often executed as a series of steps Each step corresponds to a specific normal form which has
known properties
As normalization proceeds relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies
7 NF2 non-first normal form 8 1NF R is in 1NF iff all domain values are atomic2
9 2NF R is in 2 NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
10 3NF R is in 3NF iff R is 2NF and every nonkey attribute is non-transitively dependent on the
key 11 BCNF R is in BCNF iff every determinant is a candidate key
12 Determinant an attribute on which some other attribute is fully functionally dependent Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies of
attribute sets on something other than a superset of a candidate key A table is said to be in 4NF if and
only if it is in the BCNF and multi-valued dependencies are functional dependencies The 4NF
removes unwanted data structures multi-valued dependencies
There is no Multivalued dependency in the relation
There are Multivalued dependency but the attributes are dependent between themselves
Either of these conditions must hold true in order to be fourth normal form
The relation must also be in BCNF Fourth normal form differs from BCNF only in that it uses
Multivalued dependencies
Q5
Either
(A) What are object oriented database systems What are its features
Ans Object databases are a niche field within the broader DBMS market dominated by relational
database management systems (RDBMS) Object databases have been considered since the early 1980s
and 1990s but they have made little impact on mainstream commercial data proc
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
C) How database recovery it done Discuss its different types
Ans SQL Server database recovery models give you backup-and-restore flexibility The model used will determine how much time and space your backups will take and how great your risk of data loss will
be when a breakdown occurs
System breakdowns happen all the time even to the best configured systems This is why you have to
explore the options available in order to prepare for the worst
SQL server database recovery can be easier achieved if you are running on at least the SQL server 2000
It has a built in feature known as the database recovery model that controls the following
Both the speed and size of your transaction log backups The degree to which you might be at risk of losing committed transactions in the event of
media failure
Models
There are three types of database recovery models available
Full Recovery Bulk Logged Recovery
Simple Recovery
Full Recovery
This is your best guarantee for full data recovery The SQL Server fully logs all operations so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log When data files are lost because of media failure the transaction log can be backed up
Database restoration up to any specified time can be achieved after media failure for a database
file has occurred If your log file is available after the failure you can restore up to the last
transaction committed Log Marks feature allows you to place reference points in the transaction log that allow you to
recover a log mark
Logs CREATE INDEX operations Recovery from a transaction log backup that includes index
creations is done at a faster pace because the index does not have to be rebuilt
Bulk Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the
least log space for certain bulk operations including BULK INSERT bcp CREATE INDEX
WRITETEXT and UPDATETEXT
Simple Recovery Model It allows for the fastest bulk operations and the simplest backup-and-restore strategy Under this model
SQL Server truncates the transaction log at regular intervals removing committed transactions
(d) Describe deadlocks in a distributed system.
Ans:
A deadlock occurs when a set of transactions wait for one another in a cycle: each transaction holds a lock that the next transaction in the cycle needs, so none of them can ever proceed. In a distributed system the transactions, and the data items they lock, may reside at different sites, so no single site's local wait-for graph necessarily reveals the cycle.
Common approaches to handling deadlocks in a distributed system:
- Detection: each site maintains a local wait-for graph; a global (centralized or hierarchically combined) wait-for graph is constructed, and a cycle in it indicates a deadlock. One transaction in the cycle is chosen as the victim and rolled back.
- Prevention: schemes such as wait-die and wound-wait use transaction timestamps to decide whether a requesting transaction waits or is rolled back, so that a waiting cycle can never form.
- Timeouts: a transaction that has waited longer than a threshold is assumed to be deadlocked and is rolled back. This is simple to implement but may roll back transactions that are not actually deadlocked.
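As a concrete sketch of the detection approach, a deadlock shows up as a cycle in the wait-for graph built from the (local or combined global) lock tables. The transaction names and edges below are hypothetical illustration data, not from the source:

```python
# Wait-for graph: each transaction maps to the transactions it is waiting on.
# T1 -> T2 -> T3 -> T1 forms a cycle, i.e. a deadlock; T4 waits on nobody.
waits_for = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"], "T4": []}

def has_deadlock(graph):
    """Return True iff the wait-for graph contains a cycle (a deadlock)."""
    state = {}  # node -> 1 (on current DFS path) or 2 (fully explored)

    def visit(node):
        if state.get(node) == 1:      # back edge to the current path: cycle found
            return True
        if state.get(node) == 2:      # already explored, no cycle through here
            return False
        state[node] = 1
        if any(visit(n) for n in graph.get(node, [])):
            return True
        state[node] = 2
        return False

    return any(visit(t) for t in graph)

assert has_deadlock(waits_for)
assert not has_deadlock({"T1": ["T2"], "T2": []})
```

The chosen victim would then be one transaction on the detected cycle, typically the one with the least work done so far.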
(c) Illustrate the construction of secondary key retrieval with a suitable example.
Ans: For sequential files, index-sequential files and direct files, we have considered the retrieval and update of data based on the primary key.
(i) We can also retrieve and update data based on a secondary key; this is called secondary key retrieval.
(ii) In secondary key retrieval, there may be multiple records satisfying a given key value.
(iii) For example, if we search a student file on the attribute "stud_name", we can get the set of records which satisfy the given value.
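A minimal sketch of a secondary index on a non-unique attribute, following the "stud_name" example above (the records themselves are hypothetical): unlike a primary index, one key value maps to a set of records.

```python
from collections import defaultdict

# Hypothetical student file; roll_no is the primary key, stud_name is not unique.
students = [
    {"roll_no": 1, "stud_name": "Asha"},
    {"roll_no": 2, "stud_name": "Ravi"},
    {"roll_no": 3, "stud_name": "Asha"},
]

# Secondary index: stud_name -> list of primary keys of ALL matching records.
index = defaultdict(list)
for rec in students:
    index[rec["stud_name"]].append(rec["roll_no"])

# Retrieval by secondary key returns a set of records, not a single record.
assert index["Asha"] == [1, 3]
```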
(D) Define the following terms:
(i) Specialization: the process of defining one or more subclasses of an entity type, each grouping entities that share distinguishing attributes or relationships (e.g. EMPLOYEE specialized into SECRETARY and ENGINEER).
(ii) Association: a logical connection between two otherwise independent objects or entity types, typically realized as a relationship between them.
(iii) Relationship: an association among two or more entities (e.g. a STUDENT enrolls in a COURSE).
(iv) Aggregation: an abstraction in which a relationship between entities is treated as a higher-level entity, so that it can itself participate in further relationships.
QUE 3- EITHER
(A) Let R = (A, B, C), and let r1 and r2 both be relations on schema R. Give the equivalent QBE expression for each of the following queries:
(i) r1 ∪ r2
(ii) r1 ∩ r2
(iii) r1 − r2
QUE 4- EITHER
(A) What is a join dependency? Discuss 5NF.
Ans: Join Dependencies (JD)
A join dependency can be described as follows: if a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.
A table is in fifth normal form (5NF), or Projection-Join Normal Form (PJNF), if it is in 4NF and it cannot have a lossless decomposition into any number of smaller tables.
Another way of expressing this is: every join dependency is a consequence of the candidate keys.
It can also be expressed as: there are no pairwise cyclical dependencies in a primary key comprised of three or more attributes.
Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence: if a relation cannot be decomposed any further, then it is in 5NF.
Pairwise cyclical dependency means that:
- you always need to know two values (pairwise);
- for any one value, you must know the other two (cyclical).
Example: Buying(buyer, vendor, item)
This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:

buyer | vendor        | item
------+---------------+---------
Sally | Liz Claiborne | Blouses
Mary  | Liz Claiborne | Blouses
Sally | Jordach       | Jeans
Mary  | Jordach       | Jeans
Sally | Jordach       | Sneakers
The question is: what do you do if Claiborne starts to sell Jeans? How many records must you create to record this fact?
The problem is that there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor; to determine the vendor you must know the buyer and the item; and to know the buyer you must know the vendor and the item. The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item and Vendor-Item.
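The decomposition above can be sketched with Python's built-in sqlite3 module (the table and column names follow the example; the code itself is illustrative, not from the source). It shows that the three pairwise projections join back losslessly to the original relation, and that the "Claiborne starts to sell Jeans" fact then needs only one new row:

```python
import sqlite3

# Sample data from the Buying(buyer, vendor, item) example.
rows = [
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary",  "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach",       "Jeans"),
    ("Mary",  "Jordach",       "Jeans"),
    ("Sally", "Jordach",       "Sneakers"),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE buying (buyer TEXT, vendor TEXT, item TEXT)")
con.executemany("INSERT INTO buying VALUES (?, ?, ?)", rows)

# The three smaller tables are just projections of the original.
con.execute("CREATE TABLE bv AS SELECT DISTINCT buyer, vendor FROM buying")
con.execute("CREATE TABLE bi AS SELECT DISTINCT buyer, item   FROM buying")
con.execute("CREATE TABLE vi AS SELECT DISTINCT vendor, item  FROM buying")

# Natural join of the three projections on their common columns.
rejoined = con.execute("""
    SELECT DISTINCT bv.buyer, bv.vendor, bi.item
    FROM bv
    JOIN bi ON bi.buyer  = bv.buyer
    JOIN vi ON vi.vendor = bv.vendor AND vi.item = bi.item
""").fetchall()

# Lossless: the join dependency holds, so nothing is lost or invented.
assert set(rejoined) == set(rows)

# If Liz Claiborne starts selling Jeans, ONE insert into vi is enough;
# the join then derives the fact for every buyer automatically.
con.execute("INSERT INTO vi VALUES ('Liz Claiborne', 'Jeans')")
```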
(B) Explain the architecture of an IMS system.
Ans: Information Management System (IMS) is an IBM program product that is designed to support both batch and online application programs.
[Architecture diagram: each application program (host language + DL/I) accesses its logical databases through PCBs grouped into a PSB (PSB-A for Application A, PSB-B for Application B); the IMS control program maps these onto the physical databases defined by the DBDs.]
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined by a database description (DBD). The mapping of the physical database to storage is also given in the DBD. The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program.
All DBD names in IMS are limited to a maximum length of eight characters.
Example:
1  DBD   NAME=EDUCPDBD
2  SEGM  NAME=COURSE, BYTES=256
3  FIELD NAME=(COURSE,SEQ), BYTES=3, START=1
4  FIELD NAME=TITLE, BYTES=33, START=4
5  FIELD NAME=DESCRIPN, BYTES=220, START=37
6  SEGM  NAME=PREREQ, PARENT=COURSE, BYTES=36
7  FIELD NAME=(COURSE,SEQ), BYTES=3, START=1
8  FIELD NAME=TITLE, BYTES=33, START=4
9  SEGM  NAME=OFFERING, PARENT=COURSE, BYTES=20
10 FIELD NAME=(DATE,SEQ,M), BYTES=6, START=1
11 FIELD NAME=LOCATION, BYTES=12, START=7
12 FIELD NAME=FORMAT, BYTES=2, START=19
13 SEGM  NAME=TEACHER, PARENT=OFFERING, BYTES=24
14 FIELD NAME=(EMP,SEQ), BYTES=6, START=1
15 FIELD NAME=NAME, BYTES=18, START=7
16 SEGM  NAME=STUDENT, PARENT=OFFERING, BYTES=25
17 FIELD NAME=(EMP,SEQ), BYTES=6, START=1
18 FIELD NAME=NAME, BYTES=18, START=7
19 FIELD NAME=GRADE, BYTES=1, START=25
External View
The user does not operate directly at the physical database level, but rather on an "external view" of the data. A particular user's external view consists of a collection of "logical databases", where each logical database is a subset of the corresponding physical database. Each logical database is defined by means of a program communication block (PCB). The set of all PCBs for one user, corresponding to the external schema plus the associated mapping definition, is called the program specification block (PSB).
PCB (Program Communication Block): Each logical database is defined by a program communication block (PCB). The PCB includes a specification of the mapping between the LDB and the corresponding PDB.
PSB (Program Specification Block): The set of all PCBs for a given user forms that user's program specification block (PSB).
Example:
1 PCB    TYPE=DB, DBDNAME=EDUCPDBD, KEYLEN=15
2 SENSEG NAME=COURSE, PROCOPT=G
3 SENSEG NAME=OFFERING, PARENT=COURSE, PROCOPT=G
4 SENSEG NAME=STUDENT, PARENT=OFFERING, PROCOPT=G
PROCOPT: The PROCOPT entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G ("get"), indicating retrieval only. Other possible values are I ("insert"), R ("replace") and D ("delete").
Internal View
The users are ordinary application programmers, using a host language from which the IMS data manipulation language DL/I ("Data Language/I") may be invoked by subroutine call. End-users are supported via user-written online application programs; IMS does not provide an integrated query language.
OR
(C) Explain the following:
(i) Functional dependency
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: A possible key. Each non-key field is functionally dependent on every candidate key, and no attribute in the key can be deleted without destroying the property of unique identification.
Main characteristics of the functional dependencies used in normalization:
- they have a 1:1 relationship between the attribute(s) on the left- and right-hand sides of the dependency;
- they hold for all time;
- they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation, and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
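A small illustrative check of whether a functional dependency holds in a given relation instance (the sample relation and attribute names are hypothetical): X → Y fails exactly when two rows agree on X but differ on Y.

```python
def holds(rows, lhs, rhs):
    """Return True iff the functional dependency lhs -> rhs holds in rows.

    rows: list of dicts (one per tuple); lhs, rhs: tuples of attribute names.
    """
    seen = {}
    for r in rows:
        key = tuple(r[a] for a in lhs)
        val = tuple(r[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # two rows agree on lhs but differ on rhs
    return True

# Hypothetical sample relation: staff(emp, dept, manager).
staff = [
    {"emp": "e1", "dept": "sales", "manager": "m1"},
    {"emp": "e2", "dept": "sales", "manager": "m1"},
    {"emp": "e3", "dept": "hr",    "manager": "m2"},
]

assert holds(staff, ("emp",), ("dept",))      # emp determines dept
assert holds(staff, ("dept",), ("manager",))  # dept determines manager
assert not holds(staff, ("dept",), ("emp",))  # dept does NOT determine emp
```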
(D) Explain 4NF with examples.
Ans: Normalization: the process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal form condition that the relation meets, and indicates the degree to which it has been normalized.
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normal forms up to 3NF, BCNF or 4NF; we will pay particular attention up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form with known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multivalued dependencies of attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and only if it is in BCNF and every multivalued dependency in it is in fact a functional dependency. 4NF thus removes an unwanted data structure: multivalued dependencies.
For a relation to be in fourth normal form, one of these conditions must hold:
- there is no multivalued dependency in the relation; or
- there are multivalued dependencies, but the attributes involved are dependent between themselves.
The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it also considers multivalued dependencies.
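A small sketch of the classic 4NF situation (the course/teacher/book data are hypothetical): for each course, teachers and books vary independently, so the (teacher, book) pairs form a full cross product. Detecting that cross-product shape is detecting the multivalued dependency, and projecting onto (course, teacher) and (course, book) removes the redundancy losslessly.

```python
# Relation teaches(course, teacher, book) with course ->> teacher and
# course ->> book: every teacher of a course uses every book of that course.
rows = {
    ("db", "smith", "date"),
    ("db", "smith", "ullman"),
    ("db", "jones", "date"),
    ("db", "jones", "ullman"),
}

def mvd_holds(triples):
    """True iff, per course, the (teacher, book) pairs form a cross product."""
    by_course = {}
    for c, t, b in triples:
        by_course.setdefault(c, set()).add((t, b))
    for pairs in by_course.values():
        teachers = {t for t, _ in pairs}
        books = {b for _, b in pairs}
        if pairs != {(t, b) for t in teachers for b in books}:
            return False
    return True

assert mvd_holds(rows)

# 4NF decomposition into two binary projections.
ct = {(c, t) for c, t, _ in rows}
cb = {(c, b) for c, _, b in rows}

# The natural join of the projections reconstructs the original: lossless.
rejoined = {(c, t, b) for (c, t) in ct for (c2, b) in cb if c == c2}
assert rejoined == rows
```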
Q5
Either
(A) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s, but they have made little impact on mainstream commercial data processing.
Features of object-oriented database systems:
- Most object databases offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
- Access to data can be faster because joins are often not needed (as they are in a tabular implementation of a relational database): an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
- Another area of variation between products is the way the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
- Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
- Many object databases, for example VOSS, offer support for versioning. An object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
- The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could retrieve a user's account information and efficiently provide extensive information such as transactions and account entries.
Mary Jordach Jeans
Sally Jordach Sneakers
The question is what do you do if Claiborne starts to sell Jeans How many records must you create to
record this fact
The problem is there are pairwise cyclical dependencies in the primary key That is in order to determine
the item you must know the buyer and vendor and to determine the vendor you must know the buyer and
the item and finally to know the buyer you must know the vendor and the item The solution is to break
this one table into three tables Buyer-Vendor Buyer-Item and Vendor-Item
(B) Explain the architecture of an IMS System
Ans Information Management system (IMS) is an IBM program product that is designed to support
both batch and online application programs
Host Language
+
DLI
Host Language
+
DLI
PCB PCB
DBD DBD DBD DBD DBD DBD hellip
IMS
Control
program
PCB PCB
PSB-B PSB-A
Application A Application B
Conceptual View
The conceptual view consists of collection of physical database The ldquophysicalrdquo is somewhat
misleading in this context since the user does not see such a database exactly as it is stored indeed
IMS provides a fairely high degree of insulation of the user from the storage structure Each physical
database is defined by a database description (DBD) The mapping of the physical database to storage
is also DBDrsquos corresponds to the conceptual schema plus the associated conceptualinternal mapping
definition
DBD (Database Description) Each physical databse is defined together with its mapping to
storage by a databse description (DBD) The source form of the DBD is written using a special
System370 Assembler Language macro statements Once written the DBD is assembled and the
object form is stored in a system library from which it may be extracted when required by the IMS
control program
All names of DBDrsquos in IMS are limited to a maximum length of eight characters
Example
1 DBD NAMEEDUCPDBD
2 SEGM NAME=COURSEBYTES=256
3 FILED NAME=(COURSESEQ)BYTES=3START=1 4 FIELD NAME=TITLE BYTES=33START=4
5 FIELD NAME=DESCRIPNBYETS=220START=37
6 SEGM NAME=PREREQPARENT=COURSEBYTES=36 7 FILED NAME=(COURSESEQ)BYTES=3START=1
8 FIELD NAME=TITLE BYTES=33START=4
9 SEGM NAME=OFFERINGPARENT=COURSEBYTES=20 10 FILED NAME=(DATESEQM)BYTES=6START=1
11 FIELD NAME=LOCATION BYTES=12START=7
12 FIELD NAME=FORMATBYETS=2START=19 13 SEGM NAME=TEACHERPARENT=OFFERINGBYTES=24
14 FIELD NAME=(EMPSEQ) BYTES=6START=1
15 FIELD NAME=NAMEBYETS=18START=7
16 SEGM NAME=STUDENTPARENT=OFFERINGBYTES=25
17 FILED NAME=(EMPSEQ)BYTES=6START=1
18 FIELD NAME=NAME BYTES=18START=7
19 FIELD NAME=GRADEBYTES=1START=25
External View
The user does not operate directly at the physical database level but rather on an ldquoexternal viewrdquo of
the data A particular userrsquos external view consists of a collection of ldquological databasesrdquo where each
logical database is a subset of the corresponding physical database Each logical database is defined
by means of a program communication block (PCB) The set of all PCBrsquos for one user corresponding
to the external schema plus the associated mapping definition is called program specification block
(PSB)
PCB Program Communication BLOCK Each logical Database is defined by a program
communication block (PCB) The PCB includes a specification of the mapping between the LDB and
the corresponding PDB
PSB Program Specification BLOCK The set of all PCBrsquos for a given user forms that userrsquos
program specification block (PSB)
Example
1 PCB TYPE=DBDBNAME=EDUCPDBDKEYLEN=15
2 SENSEG NAME=COURSEPROCOPT=G 3 SENSEG NAME=OFFERINGPARENT=COURSEPROCOPT=G
4 SENSEG NAME=STUDENTPARENT=OFFERINGPROCOPT=G
PROCOPT The PROCOPT entry specifies the types of operation that the user will be permitting to
perform on this segment In this example the entry is G (ldquogetrdquo) indicating retrieval only Other
possible values are I(ldquoinsertrdquo) R(ldquoreplacerdquo) and D(ldquodeleterdquo)
Internal View
The users are ordinary application programmers using a host language from which the IMS data manipulation language, DL/I ("Data Language/I"), may be invoked by subroutine call. End-users are supported via user-written on-line application programs; IMS does not provide an integrated query language.
OR
(C) Explain the following:
(i) Functional dependency
Functional Dependency: the value of one attribute (the determinant) determines the value of another attribute.
Candidate Key: a possible key. Each non-key field is functionally dependent on every candidate key, and no attribute of the key can be deleted without destroying the property of unique identification.
Main characteristics of the functional dependencies used in normalization: they have a 1:1 relationship between the attribute(s) on the left-hand and right-hand sides of the dependency, they hold for all time, and they are nontrivial.
The complete set of functional dependencies for a given relation can be very large, so it is important to find an approach that can reduce the set to a manageable size. We need to identify a set of functional dependencies (X) for a relation that is smaller than the complete set of functional dependencies (Y) for that relation and that has the property that every functional dependency in Y is implied by the functional dependencies in X.
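The definition above, that the determinant determines the dependent attribute, can be checked mechanically on sample data. A minimal Python sketch (the relation and attribute names are invented for illustration):

```python
# Illustrative sketch (data invented): testing whether a proposed
# functional dependency X -> Y holds in a sample relation.

def fd_holds(rows, lhs, rhs):
    """Return True if every pair of rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # same determinant value, different dependent value
    return True

staff = [
    {"emp_no": 1, "name": "Asha",  "dept": "CS", "dept_head": "Rao"},
    {"emp_no": 2, "name": "Ravi",  "dept": "CS", "dept_head": "Rao"},
    {"emp_no": 3, "name": "Meena", "dept": "EE", "dept_head": "Iyer"},
]

print(fd_holds(staff, ["emp_no"], ["name", "dept"]))  # True: emp_no is a determinant
print(fd_holds(staff, ["dept"], ["dept_head"]))       # True: dept -> dept_head
print(fd_holds(staff, ["dept"], ["name"]))            # False: CS maps to two names
```

Note that a check against sample data can only refute a dependency; whether an FD truly holds "for all time" is a statement about the meaning of the data, not about one extension of the relation.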
(D) Explain 4NF with examples.
Ans: Normalization is the process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. The normal form of a relation refers to the highest normal-form condition that the relation meets, and indicates the degree to which it has been normalized. Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
Normalization in industry pays particular attention to normalization up to 3NF, BCNF, or 4NF; we will pay particular attention up to 3NF. Database designers need not normalize to the highest possible normal form.
Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps, where each step corresponds to a specific normal form with known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
NF2: non-first normal form.
1NF: R is in 1NF iff all domain values are atomic.
2NF: R is in 2NF iff R is in 1NF and every nonkey attribute is fully dependent on the key.
3NF: R is in 3NF iff R is in 2NF and every nonkey attribute is non-transitively dependent on the key.
BCNF: R is in BCNF iff every determinant is a candidate key.
Determinant: an attribute on which some other attribute is fully functionally dependent.
Fourth Normal Form
Fourth normal form (4NF) requires that there are no non-trivial multi-valued dependencies of attribute sets on anything other than a superset of a candidate key. A table is in 4NF if and only if it is in BCNF and every non-trivial multi-valued dependency is also a functional dependency; 4NF thus removes an unwanted data structure, the multi-valued dependency.
For a relation to be in fourth normal form, one of these conditions must hold: either there is no multi-valued dependency in the relation, or every multi-valued dependency in it is also a functional dependency. The relation must also be in BCNF. Fourth normal form differs from BCNF only in that it additionally considers multi-valued dependencies.
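As an illustrative example (the data are invented), consider the classic Course-Teacher-Text relation: the teachers of a course and the texts used for it are independent multi-valued facts, so the relation violates 4NF and should be decomposed. A Python sketch of the decomposition and its lossless re-join:

```python
# Illustrative sketch (data invented): the CTX relation stores two
# independent multi-valued facts, COURSE ->> TEACHER and COURSE ->> TEXT,
# so every teacher must be paired with every text -- a 4NF violation.
ctx = {
    ("Physics", "Prof. Green", "Mechanics"),
    ("Physics", "Prof. Green", "Optics"),
    ("Physics", "Prof. Brown", "Mechanics"),
    ("Physics", "Prof. Brown", "Optics"),
}

# 4NF decomposition: project each independent fact into its own relation.
course_teacher = {(c, t) for c, t, _ in ctx}
course_text    = {(c, x) for c, _, x in ctx}

# The natural join of the projections reconstructs CTX losslessly.
rejoined = {(c, t, x)
            for (c, t) in course_teacher
            for (c2, x) in course_text if c == c2}
assert rejoined == ctx
print(sorted(course_teacher))  # [('Physics', 'Prof. Brown'), ('Physics', 'Prof. Green')]
```

After decomposition, adding a new text for a course means inserting one row into course_text instead of one row per teacher, which is exactly the update anomaly 4NF removes.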
Q5
Either
(A) What are object oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational database management systems (RDBMS). Object databases have been considered since the early 1980s and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems:
Most object databases offer some kind of query language, allowing objects to be found through a more declarative programming approach. It is in the area of object query languages, and the integration of the query and navigational interfaces, that the biggest differences between products are found. An attempt at standardization was made by the ODMG with the Object Query Language, OQL.
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database): an object can be retrieved directly, without a search, by following pointers. (It could, however, be argued that joining is a higher-level abstraction of pointer following.)
Another area of variation between products is the way the schema of a database is defined. A general characteristic, however, is that the programming language and the database schema use the same type definitions.
Multimedia applications are facilitated because the class methods associated with the data are responsible for its correct interpretation.
Many object databases, for example VOSS, offer support for versioning: an object can be viewed as the set of all its versions, and object versions can be treated as objects in their own right. Some object databases also provide systematic support for triggers and constraints, which are the basis of active databases.
The efficiency of such a database is also greatly improved in areas which demand massive amounts of data about one item. For example, a banking institution could retrieve a user's account object and efficiently provide extensive information such as transactions and account entries.
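The "pointer following instead of joins" point above can be sketched in a few lines of Python (the classes and data are invented; a real object database would persist such objects transparently):

```python
# Hypothetical sketch: in an object database, related objects are reached by
# following object references rather than by joining tables on key columns.
class Transaction:
    def __init__(self, amount):
        self.amount = amount

class Account:
    def __init__(self, number, balance):
        self.number = number
        self.balance = balance
        self.transactions = []  # direct references to Transaction objects

acct = Account("SB-1001", 5000)
acct.transactions.append(Transaction(-200))
acct.transactions.append(Transaction(1500))

# Navigational access: no join, just follow the pointers from the account.
print(sum(t.amount for t in acct.transactions))  # 1300
```

In a relational schema the same query would join an accounts table to a transactions table on the account number; here the relationship is stored as a reference inside the object itself.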
C) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used determines how much time and space your backups will take and how great your risk of data loss will be when a breakdown occurs. System breakdowns happen all the time, even to the best-configured systems, which is why you have to explore the options available in order to prepare for the worst.
SQL Server database recovery is more easily achieved if you are running at least SQL Server 2000, which has a built-in feature known as the database recovery model that controls the following: the speed and size of your transaction log backups, and the degree to which you might be at risk of losing committed transactions in the event of media failure.
Models
There are three types of database recovery models available: Full Recovery, Bulk-Logged Recovery, and Simple Recovery.
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the transaction log, and when data files are lost because of media failure the transaction log can be backed up.
Database restoration up to any specified point in time can be achieved after a media failure for a database file has occurred; if your log file is available after the failure, you can restore up to the last committed transaction. The Log Marks feature allows you to place reference points in the transaction log and recover to a log mark.
Full recovery also logs CREATE INDEX operations, so recovery from a transaction log backup that includes index creation is faster because the index does not have to be rebuilt.
Bulk-Logged Recovery Model
This model allows for recovery in case of media failure and gives you the best performance using the least log space for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX, WRITETEXT, and UPDATETEXT.
Simple Recovery Model
This model allows for the fastest bulk operations and the simplest backup-and-restore strategy. Under this model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
(d) Describe deadlocks in a distributed system.
Ans:
Conceptual View
The conceptual view consists of a collection of physical databases. The term "physical" is somewhat misleading in this context, since the user does not see such a database exactly as it is stored; indeed, IMS provides a fairly high degree of insulation of the user from the storage structure. Each physical database is defined, together with its mapping to storage, by a database description (DBD). The set of all DBDs corresponds to the conceptual schema plus the associated conceptual/internal mapping definition.
DBD (Database Description): each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler Language macro statements. Once written, the DBD is assembled, and the object form is stored in a system library from which it may be extracted when required by the IMS control program. All DBD names in IMS are limited to a maximum length of eight characters.
No attribute in the key can be deleted without destroying the property of
unique identification
Main characteristics of functional dependencies used in
normalization
have a 11 relationship between attribute(s) on left and right-hand side of
a dependency hold for all time are nontrivial
Complete set of functional dependencies for a given relation can be very
large
Important to find an approach that can reduce set to a manageable size
Need to identify set of functional dependencies (X) for a relation that is
smaller than complete set of functional dependencies (Y) for that relation
and has property that every functional dependency in Y is implied by
functional dependencies in X
(D) Explain 4 NF with examples
Ans Normalization The process of decomposing unsatisfactory ldquobadrdquo relations by breaking up
their attributes into smaller relationsThe normal form of a relation refers to the highest normal form
condition that a relation meets and indicates the degree to which it has been normalized
Normalization is carried out in practice so that the resulting designs are of high quality and meet the
desirable properties
Normalization in industry pays particular attention to normalization up to 3NF BCNF or 4NF
We will pay particular attention up to 3NF
The database designers need not normalize to the highest possible normal form
Formal technique for analyzing a relation based on its primary key and functional dependencies
between its attributes
Often executed as a series of steps Each step corresponds to a specific normal form which has
known properties
As normalization proceeds relations become progressively more restricted (stronger) in format and
also less vulnerable to update anomalies
7 NF2 non-first normal form 8 1NF R is in 1NF iff all domain values are atomic2
9 2NF R is in 2 NF iff R is in 1NF and every nonkey attribute is fully dependent on the key
10 3NF R is in 3NF iff R is 2NF and every nonkey attribute is non-transitively dependent on the
key 11 BCNF R is in BCNF iff every determinant is a candidate key
12 Determinant an attribute on which some other attribute is fully functionally dependent Fourth Normal Form
Fourth normal form (or 4NF) requires that there are no non-trivial multi-valued dependencies of
attribute sets on something other than a superset of a candidate key A table is said to be in 4NF if and
only if it is in the BCNF and multi-valued dependencies are functional dependencies The 4NF
removes unwanted data structures multi-valued dependencies
There is no Multivalued dependency in the relation
There are Multivalued dependency but the attributes are dependent between themselves
Either of these conditions must hold true in order to be fourth normal form
The relation must also be in BCNF Fourth normal form differs from BCNF only in that it uses
Multivalued dependencies
Q5
Either
(a) What are object-oriented database systems? What are their features?
Ans: Object databases are a niche field within the broader DBMS market, which is dominated by relational
database management systems (RDBMS). Object databases have been studied since the early 1980s
and 1990s, but they have made little impact on mainstream commercial data processing.
Features of object oriented database systems
Most object databases also offer some kind of query language allowing objects to be found by a more declarative programming approach It is in the area of object query languages and the integration of the
query and navigational interfaces that the biggest differences between products are found An attempt at
standardization was made by the ODMG with the Object Query Language OQL
Access to data can be faster because joins are often not needed (as in a tabular implementation of a relational database) This is because an object can be retrieved directly without a search by following
pointers (It could however be argued that joining is a higher-level abstraction of pointer following)
Another area of variation between products is in the way that the schema of a database is defined A
general characteristic however is that the programming language and the database schema use the same
type definitions
Multimedia applications are facilitated because the class methods associated with the data are responsible
for its correct interpretation
Many object databases for example VOSS offer support for versioning An object can be viewed as the
set of all its versions Also object versions can be treated as objects in their own right Some object
databases also provide systematic support for triggers and constraints which are the basis of active
databases
The efficiency of such a database is also greatly improved in areas which demand massive amounts of
data about one item For example a banking institution could get the users account information and
provide them efficiently with extensive information such as transactions account information entries etc
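The "no joins needed" point above can be sketched in plain Python (the Account and Customer classes are invented for illustration): once an object holds direct references to its related objects, retrieval is simple pointer following, whereas the relational style resolves a key by searching a table of rows:

```python
# Sketch: object navigation via direct references, versus the key-based
# lookup that a relational join performs. The classes are hypothetical.

class Account:
    def __init__(self, number, balance):
        self.number = number
        self.balance = balance

class Customer:
    def __init__(self, name, accounts):
        self.name = name
        self.accounts = accounts  # direct object references, not foreign keys

alice = Customer("Alice", [Account("A-1", 100), Account("A-2", 250)])

# Object style: follow the pointers -- no search required.
total = sum(a.balance for a in alice.accounts)

# Relational style: resolve a foreign key by scanning a table of rows.
accounts_table = [("A-1", "Alice", 100), ("A-2", "Alice", 250)]
total_join = sum(bal for num, owner, bal in accounts_table if owner == "Alice")

print(total, total_join)  # both 350
```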
(c) How is database recovery done? Discuss its different types.
Ans: SQL Server database recovery models give you backup-and-restore flexibility. The model used
determines how much time and space your backups will take and how great your risk of data loss will
be when a breakdown occurs. System breakdowns happen all the time, even to the best-configured
systems, which is why you have to explore the available options in order to prepare for the worst.
Database recovery is easier to achieve if you are running at least SQL Server 2000, which has a built-in
feature known as the database recovery model. It controls:
- the speed and size of your transaction log backups, and
- the degree to which you are at risk of losing committed transactions in the event of media failure.
Models
There are three database recovery models available: Full Recovery, Bulk-Logged Recovery, and
Simple Recovery.
Full Recovery
This is your best guarantee of full data recovery. SQL Server fully logs all operations, so every row
inserted through a bulk copy program (bcp) or BULK INSERT operation is written in its entirety to the
transaction log. When data files are lost because of media failure, the transaction log can be backed up.
- The database can be restored up to any specified point in time after a media failure. If the log file is
available after the failure, you can restore up to the last committed transaction.
- The log-marks feature lets you place reference points in the transaction log so that you can recover
to a named mark.
- CREATE INDEX operations are logged, so recovery from a transaction log backup that includes index
creations is faster because the index does not have to be rebuilt.
Bulk-Logged Recovery
This model allows for recovery in case of media failure and gives the best performance, using the
least log space, for certain bulk operations, including BULK INSERT, bcp, CREATE INDEX,
WRITETEXT, and UPDATETEXT.
Simple Recovery
This model allows the fastest bulk operations and the simplest backup-and-restore strategy. Under this
model, SQL Server truncates the transaction log at regular intervals, removing committed transactions.
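The effect of the simple model on the log can be sketched with a toy write-ahead log (this illustrates the truncation idea only; it is not SQL Server's actual implementation): at each checkpoint, records belonging to committed transactions are discarded, so the log stays small but cannot be used for point-in-time restores:

```python
# Toy sketch of transaction-log truncation under the simple recovery
# model: committed transactions are dropped from the log at checkpoints.
# Illustrative only; not SQL Server's real log implementation.

log = []            # write-ahead log: (txn_id, operation) records
committed = set()   # transaction ids known to be committed

def write(txn, op):
    log.append((txn, op))

def commit(txn):
    committed.add(txn)

def checkpoint():
    """Simple model: truncate the records of committed transactions."""
    global log
    log = [(t, op) for t, op in log if t not in committed]

write(1, "INSERT row A")
write(2, "UPDATE row B")
commit(1)
checkpoint()  # txn 1's records are truncated away

print(log)  # only the uncommitted txn 2 remains: [(2, 'UPDATE row B')]
```

Under the full model, by contrast, those committed records would be kept until a log backup, which is what makes point-in-time restore possible.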
(d) Describe deadlocks in a distributed system.
Ans: