
Upload: trantu

Post on 28-Apr-2018


DHANALAKSHMI COLLEGE OF ENGINEERING

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

CS6302 DATABASE MANAGEMENT SYSTEMS

Part A

UNIT I INTRODUCTION TO DBMS

1. Define Data Independence (A/M-08, N/D-10)

Data independence is the type of data transparency that matters for a centralized DBMS. It refers to the immunity of user applications to changes in the definition and organization of data.

2. Differentiate primary key from candidate key (A/M-08)

Candidate Key: A candidate key is a column, or a minimal combination of columns, that can qualify as a unique key in a table. There can be multiple candidate keys in one table, and each candidate key can qualify as the primary key.

Primary Key: A primary key is a column or a combination of columns that uniquely identifies a record. Only one candidate key can be chosen as the primary key.

3. Write a weak entity in an ER diagram. (A/M-08)

Entity types that do not have key attributes of their own are called weak entity types. A weak entity set is depicted by a double rectangle, and its discriminator (partial key) is underlined with a dashed line.

4. What is meant by referential integrity? (A/M-08)

Referential integrity requires that a record in one file that refers to a record in another file must refer to a record that actually exists. For example, every section record must be related to an existing course record.
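The section/course example can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table and column names (course, section, course_id) are assumptions, and SQLite checks foreign keys only after `PRAGMA foreign_keys = ON` is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE section ("
    " section_id INTEGER PRIMARY KEY,"
    " course_id TEXT NOT NULL REFERENCES course(course_id))"
)
conn.execute("INSERT INTO course VALUES ('CS6302', 'Database Management Systems')")
conn.execute("INSERT INTO section VALUES (1, 'CS6302')")  # refers to an existing course

try:
    # No course 'XX0000' exists, so this section record would dangle.
    conn.execute("INSERT INTO section VALUES (2, 'XX0000')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The second insert is refused because it would leave a section record with no related course record.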


5. What is meant by domain integrity? (A/M-08)

Domain integrity is enforced by domain constraints, which specify that within each tuple, the value of each attribute A must be an atomic value from the domain dom(A).

6. Write five responsibilities of the DB manager. (M/J-07)

A person who has central control over the database system is called a database administrator (DBA).

The functions of a DBA include:

Schema definition

Storage structure and access-method definition

Schema and physical-organization modification

Granting of authorization for data access

Routine maintenance

7. Write the limitations of E-R model. (M/J-07)

The limitations of E-R model are:

There is no industry standard notation for developing an E-R diagram

The E-R data model is especially popular for high level

8. Define Super Key (N/D-06)

Super Key: A super key is a superset of a key. It is a set of one or more attributes that, taken collectively, uniquely identify a tuple in a relation.

9. Write any two advantages of database systems. (N/D-07)

Advantages of Database Systems are:

Controlling redundancy

Restricting Unauthorized Access

Providing persistent storage for program objects

Providing backup and recovery

Providing Multiple users interfaces

10. Write the reasons why null values might be introduced into the database. (N/D-07)

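Null values are introduced when the value of an attribute is unknown, is not applicable to a tuple, or is not yet available. SQL allows NULLs as attribute values; a NOT NULL constraint may be specified if NULL is not permitted for a particular attribute.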

11. Write the basic structure of a relational database with an example. (A/M-10)

A relational database consists of a collection of tables, each having a unique name.

A row in a table represents a relationship among a set of values. Thus a table represents

a collection of relationships. There is a direct correspondence between the concept of a table and

the mathematical concept of a relation. A substantial theory has been developed for relational

databases.


12. What are the different types of data model? (M/J 2012)

A data model is a collection of concepts that can be used to describe the structure of a database.

Types:

High level or Conceptual Data model

Low level or Physical Data model

Entity Relationship Model

Representational Model

Relational data model

Network and Hierarchical Data Model

Object Data Model

Record based data model

13. What is query language? Write the classification of the query language. (M/J-07)

A query language is a language in which a user requests information from a

database. These are typically higher-level than programming languages.

They may be classified as:

Procedural, where the user instructs the system to perform a sequence of

operations on the database, which will compute the desired information

Nonprocedural, where the user specifies the information desired without

giving a procedure for obtaining the information

14. What are the different types of integrity constraints used in designing a relational

database? (N/D-07)

A constraint is a rule that the database system enforces to keep the data in a table valid.

There are five types of constraints:

A NOT NULL constraint is a rule that prevents null values from being

entered into one or more columns within a table.

A unique constraint (also referred to as a unique key constraint) is a rule

that forbids duplicate values in one or more columns within a table.

Unique and primary keys are the supported unique constraints.

A primary key constraint is a column or combination of columns that has the same properties as a unique constraint and, in addition, does not allow null values. A table can have only one primary key.

A foreign key constraint (also referred to as a referential constraint or

a referential integrity constraint) is a logical rule about values in one or

more columns in one or more tables.

A check constraint (also called a table check constraint) sets restrictions on data added to a specific table.
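As a rough illustration of these constraint types, the sketch below uses Python's sqlite3 module with a hypothetical employee table; the table, its columns, and the helper `violates` are assumptions made for the example, and the foreign key constraint is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"        # primary key constraint
    " email  TEXT NOT NULL UNIQUE,"       # NOT NULL and unique constraints
    " salary REAL CHECK (salary >= 0))"   # check constraint
)
conn.execute("INSERT INTO employee VALUES (1, 'a@example.com', 1000)")

def violates(sql):
    """Return True if the statement is rejected by a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

null_email = violates("INSERT INTO employee VALUES (2, NULL, 500)")
duplicate_email = violates("INSERT INTO employee VALUES (3, 'a@example.com', 500)")
negative_salary = violates("INSERT INTO employee VALUES (4, 'b@example.com', -1)")
```

Each of the three rejected inserts breaks exactly one rule, so only the first row remains in the table.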

15. Why is it necessary to decompose a relation? (M/J-07)

We decompose a relation to avoid redundancy and the update anomalies that redundancy causes. There are two types of decomposition:

Lossless-join decomposition

Lossy decomposition


16. Write a SQL statement to find the names and loan numbers of all customers who have a

loan at Chennai branch. (N/D-06)

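SQL> SELECT name, loan.loanno FROM customer, loan WHERE customer.loanno = loan.loanno AND branch = 'Chennai';

(This assumes that the customer and loan tables are related through a common loanno column.)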
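A query of this kind can be tried out with Python's sqlite3 module. The sketch below is only one possible reading of the question: the schema (a loan table with loanno and branch, a customer table with name and loanno) and the sample rows are assumptions, since the question does not give them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE loan (loanno INTEGER PRIMARY KEY, branch TEXT);
    CREATE TABLE customer (name TEXT, loanno INTEGER REFERENCES loan(loanno));
    INSERT INTO loan VALUES (101, 'Chennai'), (102, 'Madurai');
    INSERT INTO customer VALUES ('Anu', 101), ('Ravi', 102);
""")

# Join the two tables on the shared loan number, then filter on the branch.
rows = conn.execute(
    "SELECT c.name, l.loanno"
    " FROM customer AS c JOIN loan AS l ON c.loanno = l.loanno"
    " WHERE l.branch = ?",
    ("Chennai",),
).fetchall()
```

Without the join condition, every customer would be paired with every loan, so the condition on the shared column is what ties a customer to the loan actually held.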

17. What is static SQL? How does it differ from dynamic SQL? (N/D-07)

Embedded SQL is an example of static SQL. It is called static SQL because the SQL statements in the program are static; that is, they do not change each time the program is run. In dynamic SQL, by contrast, the statements are built or accepted at run time.

18. Write the usage of rename operation. (A/M-10)

The rename operation is a unary operator that can rename the relation, its attributes, or both. The general rename operation, when applied to a relation r of degree n, is denoted by any of the following three forms:

ρs(b1, b2, ..., bn)(r) or ρs(r) or ρ(b1, b2, ..., bn)(r)

where the symbol ρ (rho) denotes the rename operator, s is the new relation name, and b1, b2, ..., bn are the new attribute names.

19. List out the relational algebra operators. (N/D-10)

The relational algebra operators are :

Select

Project

Rename

20. What is meant by multivalued dependency? (A/M-06,N/D-12)

A multivalued dependency X →→ Y specified on relation schema R, where X and Y are both subsets of R, specifies the following constraint on any relation state r of R: if two tuples t1 and t2 exist in r such that t1[X] = t2[X], then two tuples t3 and t4 should also exist in r with the following properties, where Z denotes (R – (X ∪ Y)):

■ t3[X] = t4[X] = t1[X] = t2[X].

■ t3[Y] = t1[Y] and t4[Y] = t2[Y].

■ t3[Z] = t2[Z] and t4[Z] = t1[Z].

Whenever X →→ Y holds, we say that X multidetermines Y. Because of the symmetry in the definition, whenever X →→ Y holds in R, so does X →→ Z. Hence, X →→ Y implies X →→ Z, and therefore it is sometimes written as X →→ Y | Z.
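The definition can be checked mechanically on a small relation state. The following is a minimal sketch, assuming the relation is given as a list of Python dicts and that X and Y are disjoint; the function name `mvd_holds` and the sample attributes (emp, project, dependent) are made up for illustration.

```python
from itertools import product

def mvd_holds(rows, attrs, X, Y):
    """Return True if X ->> Y holds in `rows` (a list of dicts over `attrs`)."""
    present = {tuple(r[a] for a in attrs) for r in rows}
    for t1, t2 in product(rows, repeat=2):
        if all(t1[a] == t2[a] for a in X):
            # t3 takes its X and Y values from t1 and its Z values from t2.
            t3 = tuple(t1[a] if a in X or a in Y else t2[a] for a in attrs)
            if t3 not in present:
                return False
    return True

attrs = ["emp", "project", "dependent"]
full = [
    {"emp": "Smith", "project": "X", "dependent": "John"},
    {"emp": "Smith", "project": "Y", "dependent": "Anna"},
    {"emp": "Smith", "project": "X", "dependent": "Anna"},
    {"emp": "Smith", "project": "Y", "dependent": "John"},
]
holds = mvd_holds(full, attrs, ["emp"], ["project"])       # every combination is present
broken = mvd_holds(full[:3], attrs, ["emp"], ["project"])  # (Smith, Y, John) is missing
```

Because the loop visits every ordered pair (t1, t2), the tuple t4 of the definition is covered when the same two tuples are visited in the opposite order.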

21. Why are certain functional dependencies called trivial functional dependencies?

(M/J-12)

A trivial functional dependency occurs when we describe a functional dependency

of an attribute on a collection of attributes that include the original attribute. This type of

functional dependency is called trivial because it can be derived from common sense. It is

obvious that if you already know the value of B, then the value of B can be uniquely

determined by that knowledge.

For example, “{A, B} -> B” is a trivial functional dependency, as is “{name, SSN} ->

SSN”.


22. Write an example of a relation schema R and a set of dependencies such that R is in

BCNF, but not in 4NF. (M/J-12)

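An example is R = (A, B, C) with the multivalued dependency A →→ B (and hence A →→ C) and no nontrivial functional dependencies. The only key of R is {A, B, C}, so R is in BCNF; but A →→ B is nontrivial and A is not a superkey, so R is not in 4NF.

For comparison, the relation R = (A, B, C, D) with the set of functional dependencies F = {A → B, C → D, B → C} allows three distinct BCNF decompositions:

R1 = {(A,B), (C,D), (B,C)}

R2 = {(A,B), (C,D), (A,C)}

R3 = {(B,C), (A,D), (A,B)}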

23. What is meant by Normalization? (A/M-10)

Database normalization is the process of organizing the fields and tables of

a relational database to minimize redundancy. Normalization usually involves dividing

large tables into smaller (and less redundant) tables and defining relationships between

them. The objective is to isolate data so that additions, deletions, and modifications of a

field can be made in just one table and then propagated through the rest of the database

using the defined relationships.

24. Write a note on functional dependencies. (A/M-10)

A functional dependency, denoted by X → Y, between two sets of attributes X and Y that

are subsets of R specifies a constraint on the possible tuples that can form a relation state r of

R. The constraint is that, for any two tuples t1 and t2 in r that have t1[X] = t2[X], they must

also have t1[Y] = t2[Y].

This means that the values of the Y component of a tuple in r depend on, or are

determined by, the values of the X component; alternatively, the values of the X component

of a tuple uniquely (or functionally) determine the values of the Y component.
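The constraint can be tested directly on a relation state. The sketch below is illustrative only: the function name `fd_holds` and the sample attributes (ssn, name, dept) are assumptions. A single relation state can show that a dependency is violated, but it cannot prove that the dependency holds for the schema in general.

```python
def fd_holds(rows, X, Y):
    """Return True if X -> Y holds in the relation state `rows` (a list of dicts)."""
    seen = {}
    for t in rows:
        x_val = tuple(t[a] for a in X)
        y_val = tuple(t[a] for a in Y)
        # Tuples that agree on X must also agree on Y.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True

rows = [
    {"ssn": 1, "name": "Anu", "dept": "CSE"},
    {"ssn": 2, "name": "Ravi", "dept": "CSE"},
    {"ssn": 3, "name": "Anu", "dept": "ECE"},
]
ssn_determines_name = fd_holds(rows, ["ssn"], ["name"])    # holds in this state
dept_determines_name = fd_holds(rows, ["dept"], ["name"])  # fails: CSE has two names
trivial = fd_holds(rows, ["name", "ssn"], ["ssn"])         # a trivial FD always holds
```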

25. Define - Irreducible Set of Dependencies. (N/D-10)

A set S of functional dependencies is irreducible if it has the following three properties:

The right-hand side of every functional dependency in S contains only one attribute.

The left-hand side of every functional dependency in S is irreducible; that is, removing any attribute from the left-hand side changes the closure of S (S loses some information).

No functional dependency can be removed from S without changing its closure.

26. Define 3NF (N/D-10)

Third normal form (3NF) is based on the concept of transitive dependency. A functional dependency X → Y in a relation schema R is a transitive dependency if there exists a set of attributes Z in R that is neither a candidate key nor a subset of any key of R, and both X → Z and Z → Y hold. A relation schema R is in 3NF if it is in 2NF and no nonprime attribute of R is transitively dependent on the primary key.


UNIT II- SQL AND QUERY OPTIMIZATION

Part – A

1. Explain the following :

i) DDL ii) DML

DDL:

A database schema is specified by a set of definitions expressed in a special language called a data definition language.

Example: Create, Alter, Truncate and Drop

DML:

A data manipulation language is a language that enables users to access or

manipulate data as organized by the appropriate data model.

Example: Select, Insert, Update, Delete

2. Write the two types of embedded SQL SELECT statements. (N/D – 11)

The two types of embedded SQL SELECT statements are,

i) Singleton: It retrieves only one row of SQL data.

ii) Cursor: It is a temporary working area used to store the data retrieved from the database and to manipulate that data. It can hold more than one row but processes only one row at a time.
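The two retrieval styles have a rough analogue in Python's sqlite3 module. This is not embedded SQL itself, only a sketch of the same idea; the student table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Anu"), (2, "Ravi"), (3, "Mala")])

# Singleton-style SELECT: the query is expected to return exactly one row.
one_name = conn.execute("SELECT name FROM student WHERE roll = ?", (2,)).fetchone()[0]

# Cursor-style SELECT: the result may hold many rows, processed one at a time.
cursor = conn.execute("SELECT name FROM student ORDER BY roll")
names = []
for (name,) in cursor:
    names.append(name)
```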

3. What is meant by embedded SQL? What are its advantages? (A/M – 11)

Embedded SQL is a method of combining the computing power of a programming

language and the database manipulation capabilities of SQL. Embedded SQL statements are

SQL statements written in line with the program source code of the host language. The

embedded SQL statements are parsed by an embedded SQL pre-processor and replaced by

host-language calls to a code library. The output from the pre-processor is then compiled by

the host compiler. This allows programmers to embed SQL statements in programs written in

any number of languages such as: C/C++, COBOL and FORTRAN.

4. What are the parts of SQL language?

The SQL language has several parts:

Data definition language

Data manipulation language

View definition

Transaction control

Embedded SQL

Integrity

Authorization

5. What are the categories of SQL command?

SQL commands are divided into the following categories:

Data Definition Language

Data Manipulation Language

Data Query Language

Data Control Language


Data Administration Statements

Transaction Control Statements

6. What are the three clauses of SQL expression?

An SQL expression consists of three clauses:

Select

From

Where

7. What is the use of sub queries?

A subquery is a select-from-where expression that is nested within another query. Subqueries are commonly used to test for set membership, make set comparisons, and determine set cardinality.

8. What is query processing?

Query processing refers to the range of activities involved in extracting data from a database.

9. What are the steps involved in query processing?

The steps involved in query processing are:

Parsing and translation

Optimization

Evaluation

10. What is an evaluation primitive?

A relational algebra operation annotated with instructions on how to evaluate it is called an evaluation primitive.

11. What is a query evaluation plan?

A sequence of primitive operations that can be used to evaluate a query is called a query evaluation plan or a query execution plan.

12. What is a query-execution engine?

The query execution engine takes a query evaluation plan, executes that plan, and returns the answers to the query.

13. Define Query Optimization

Query optimization refers to the process of finding the lowest-cost method of evaluating a given query.

14. What are the data types of SQL?

Data Type       Description
CHARACTER(n)    Character string, fixed length n
VARCHAR(n)      Character string, variable length, maximum length n
BINARY(n)       Binary string, fixed length n
BOOLEAN         Stores TRUE or FALSE values
INTEGER         Integer numerical (no decimal), precision p

15. What are the commands in DDL?

Data definition language (DDL) commands enable you to perform the following tasks:

Create, alter, and drop schema objects

Grant and revoke privileges and roles

Add comments to the data dictionary

The CREATE, ALTER, and DROP commands require exclusive access to the object being

acted upon. For example, an ALTER TABLE command fails if another user has an open

transaction on the specified table.

16. Differentiate Static from Dynamic SQL?

Most application programs are designed to process static SQL statements and fixed

transactions. In this case, you know the makeup of each SQL statement and transaction

before runtime; that is, you know which SQL commands will be issued, which database

tables might be changed, which columns will be updated, and so on.

However, some applications might be required to accept and process any valid SQL

statement at runtime. So, you might not know until runtime all the SQL commands, database

tables, and columns involved.

Dynamic SQL is an advanced programming technique that lets your program accept or

build SQL statements at run time and take explicit control over data type conversion.
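The contrast can be sketched in Python with the sqlite3 module. The emp table, the `build_query` helper, and the column whitelist are assumptions made for this illustration; they are not part of any embedded-SQL precompiler.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                 [("Anu", "CSE", 500), ("Ravi", "ECE", 400), ("Mala", "CSE", 300)])

# Static style: the statement text is fixed in the program; only the value varies.
STATIC_QUERY = "SELECT name FROM emp WHERE dept = ? ORDER BY name"
cse_names = [r[0] for r in conn.execute(STATIC_QUERY, ("CSE",))]

ALLOWED_COLUMNS = {"name", "dept", "salary"}

def build_query(columns, order_by):
    """Dynamic style: the statement text itself is assembled at run time."""
    if not set(columns) <= ALLOWED_COLUMNS or order_by not in ALLOWED_COLUMNS:
        raise ValueError("unknown column")
    return f"SELECT {', '.join(columns)} FROM emp ORDER BY {order_by}"

dynamic_sql = build_query(["name", "salary"], "salary")
dynamic_rows = conn.execute(dynamic_sql).fetchall()
```

In the static case the tables and columns are known before the program runs; in the dynamic case they are known only once `build_query` is called, which is why the sketch validates the column names before building the text.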


UNIT-III TRANSACTION PROCESSING AND CONCURRENCY CONTROL

Part - A

1. What is transaction? (A/M − 10)

A transaction is a collection of operations that form a single logical unit of work. While a transaction is executing, the database may be temporarily inconsistent. When the transaction is committed, the database must be consistent.

2. List out the SQL statements used for transaction control. (N/D – 11)

The SQL standard specifies that a transaction begins implicitly. Transactions are ended by

one of these SQL statements:

a. Commit work commits the current transaction and begins a new one

b. Rollback work causes the current transaction to abort
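The effect of commit and rollback can be sketched with Python's sqlite3 module, which also begins a transaction implicitly before a data-modifying statement. The account table, the `transfer` function, and the amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move `amount` from src to dst as one transaction; undo everything on failure."""
    try:
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.commit()      # both updates become permanent together
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # the credit to dst is undone as well
        return False

ok = transfer("A", "B", 30)        # succeeds and is committed
failed = transfer("A", "B", 500)   # would overdraw A, so it is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After the failed transfer, the balances are the same as they were after the first, committed transfer.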

3. When is a transaction said to be rolled back?

Any changes that an aborted transaction made to the database must be undone. Once the changes caused by the aborted transaction have been undone, the transaction is said to have been rolled back.

4. What are the states of a transaction?

The states of transaction are:

a. active

b. partially committed

c. failed

d. aborted

e. committed

f. terminated

5. Define ACID Properties (A/M – 10)

The ACID properties are the properties of transactions that a database system maintains to ensure the integrity of its data:

a. Atomicity: Either all operations of the transaction are reflected properly in the

database, or none.

b. Consistency: Execution of a transaction in isolation [that is, with no other transaction

executing concurrently] preserves the consistency of the database.

c. Isolation: Even though multiple transactions may execute concurrently, the system

guarantees that, for every pair of transactions Ti and Tj, it appears to Ti that either Tj

finished execution before Ti started, or Tj started execution after Ti finished.

d. Durability: After a transaction completes successfully, the changes it has made to the

database persist, even if there are system failures.

6. What are the properties of transaction?

The properties of transactions are

a. Atomicity

b. Consistency

c. Isolation

d. Durability


7. Write the two commonly used concurrency control techniques. (N/D – 11)

Commonly used concurrency control techniques are:

a. Two-Phase locking

b. Concurrency control based on Timestamp ordering

c. Multi-version Concurrency Control techniques

d. Lock Compatibility Matrix

e. Lock Granularity

8. Give the reasons for allowing concurrency.

The reasons for allowing concurrency are that if the transactions run serially, a short

transaction may have to wait for a preceding long transaction to complete, which can lead to

unpredictable delays in running a transaction. So the concurrent execution reduces the

unpredictable delays in running transactions.

9. What are the three kinds of intent locks? (N/D – 10)

The three kinds of intent locks are:

a. Intent share

b. Intent exclusive

c. Share with intent exclusive

10. What are two pitfalls of lock-based protocols? (M/J–11)

The two pitfalls of lock-based protocols are:

a. Deadlock

b. Starvation

11. Define Lock

A lock is a variable associated with a data item that describes the status of the item with respect to the possible operations that can be applied to it.

There is one lock for each data item.

The lock is used to synchronize access to the data item.

12. What are the different modes of lock?

The different modes of lock are:

a. shared

b. exclusive

Read_locked (shared lock): the item is locked for read purpose and can be shared for

reading by another transaction

Write_locked (exclusive lock): the item is locked for write purpose and cannot be

accessed by another transaction

13. What are the advantages of two phase locking protocol? (M/J–12)

The advantages of the two-phase locking protocol are:

It ensures conflict serializability

In its strict form, it produces only cascadeless schedules

Recovery is very easy

14. What are the phases of two Phase locking protocol?

The phases of two phase locking protocol are,


Growing phase: A transaction may obtain locks but may not release any lock

Shrinking phase: A transaction may release locks but may not obtain any new locks
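The two phases can be sketched as a small rule checker in Python. This is a simplified illustration: the class name `TwoPhaseTransaction` is made up, and the sketch tracks only one transaction's own locks, ignoring lock modes and conflicts with other transactions.

```python
class TwoPhaseTransaction:
    """Tracks the locks of one transaction and enforces the two phases."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False   # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot lock {item!r} in the shrinking phase")
        self.held.add(item)

    def unlock(self, item):
        self.held.remove(item)
        self.shrinking = True    # the growing phase is over


t = TwoPhaseTransaction("T1")
t.lock("A")
t.lock("B")        # growing phase: locks are only acquired
t.unlock("A")      # shrinking phase begins
try:
    t.lock("C")    # violates two-phase locking
    violation_detected = False
except RuntimeError:
    violation_detected = True
```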

15. What are the two types of serializability?

The two types of serializability are:

conflict serializability

view serializability

16. Define Deadlock

A deadlock is a situation in which two or more transactions each wait for a lock held by another, so that none of the transactions can ever proceed with its normal execution.

17. What are the two methods for dealing with deadlock problem?

The two methods for dealing with the deadlock problem are:

a. Deadlock prevention

b. Deadlock detection and recovery


UNIT-IV TRENDS IN DATABASE TECHNOLOGY

Part – A

1. What is meant by flash memory? [A/M − 10]

Flash memory, also known as electrically erasable programmable read-only memory (EEPROM), differs from main memory in that data survive power failure. Reading data from

flash memory takes less than 100 nanoseconds (a nanosecond is 1/1000 of a microsecond),

which is roughly as fast as reading data from main memory. Flash memory has found

popularity as a replacement for magnetic disks for storing small volumes of data.

2. List out the physical storage media. [A/M − 10]

The various physical storage media are:

a. Cache

b. Main memory

c. Flash memory

d. Magnetic disk storage

e. Optical storage

f. Tape storage

3. What is the drawback of flash memory? [N/D − 10]

A main drawback of flash memory is that it can support only a limited number of erase cycles, ranging from 10,000 to 1 million. Another drawback is that writing data to flash memory is more complicated: data can be written once, which takes about 4 to 10 microseconds, but cannot be overwritten directly. To overwrite memory that has already been written, we have to erase an entire bank of memory at once, which is then ready to be written again.

4. What are the types of storage devices?

The types of storage devices are:

Primary storage

Secondary storage

Tertiary storage

Volatile storage and

Non-volatile storage

5. What is meant by RAID? [M/J – 13]

RAID (Redundant Array of Independent Disks) is a storage technology that combines

multiple disk drive components into a logical unit for the purposes of data redundancy and

performance improvement. Data is distributed across the drives in one of several ways,

referred to as RAID levels, depending on the specific level of redundancy and performance

required.


6. What is meant by mirroring?

The simplest approach to introduce redundancy is to duplicate every disk. This technique

is called mirroring or shadowing.

7. What is meant by bit-level striping?

Data striping consists of splitting the bits of each byte across multiple disks. This is called

bit-level striping. In an array of eight disks, write bit i of each byte to disk i. Each access can

read data at eight times the rate of a single disk.
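The placement rule "write bit i of each byte to disk i" can be sketched in Python. The function names are hypothetical, disks are modelled as plain lists of bits, and numbering the bits from the least significant end is an assumption of this sketch.

```python
def stripe_bits(data, disks=8):
    """Write bit i of each byte to disk i; returns one bit list per disk."""
    striped = [[] for _ in range(disks)]
    for byte in data:
        for i in range(disks):
            striped[i].append((byte >> i) & 1)
    return striped

def read_bytes(striped):
    """Reassemble the original bytes by reading one bit from every disk."""
    count = len(striped[0])
    return bytes(
        sum(striped[i][k] << i for i in range(len(striped)))
        for k in range(count)
    )

disks = stripe_bits(b"DB")
restored = read_bytes(disks)
```

Reading one byte touches all eight disks at once, which is where the eightfold transfer rate comes from.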

8. What is meant by block-level striping?

Block level striping stripes blocks across multiple disks. It treats the array of disks as a

large disk, and gives blocks logical numbers. Requests for different blocks can run in parallel

if the blocks reside on different disks.
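A minimal sketch of the logical block numbering, assuming round-robin placement and zero-based disk numbers (both are assumptions of this example, as are the function names):

```python
def locate_block(logical_block, num_disks):
    """Map a logical block number to (disk index, block offset on that disk)."""
    return logical_block % num_disks, logical_block // num_disks

def can_run_in_parallel(block_a, block_b, num_disks):
    """Two requests can proceed in parallel if their blocks are on different disks."""
    return locate_block(block_a, num_disks)[0] != locate_block(block_b, num_disks)[0]

placement = [locate_block(b, 4) for b in range(6)]
```

With four disks, logical blocks 0 and 1 land on different disks and can be read in parallel, while blocks 0 and 4 share a disk and cannot.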

9. What are the factors to be taken into account when choosing a RAID level?

The factors to be taken into account when choosing a RAID level are,

a. Monetary cost of extra disk storage requirements

b. Performance requirements in terms of number of i/o operations

c. Performance when a disk has failed and

d. Performances during rebuild.

10. What are the advantages and disadvantages of indexed sequential files? [M/J − 11]

Advantages of Indexed Sequential Files

Allows records to be accessed directly or sequentially.

Direct access ability provides vastly superior (average) access times.

Disadvantages of Indexed Sequential Files

The fact that several tables must be stored for the index makes for a considerable

storage overhead.

As the items are stored in a sequential fashion this adds complexity to the

addition/deletion of records. Because frequent updating can be very inefficient,

especially for large files, batch updates are often performed.

11. What are the ways in which the variable-length records arise in database systems?

The ways in which the variable-length records arise in database systems are,

a. Storage of multiple record types in a file

b. Record types that allow variable lengths for one or more fields and

c. Record types that allow repeating fields.


12. What is heap file organization?

In the heap file organization, any record can be placed anywhere in the file where there is space for the record. There is no ordering of records, and typically there is a single file for each relation.

13. What is sequential file organization?

In the sequential file organization, the records are stored in sequential order, according to

the value of a “search key” of each record.

14. What is hashing file organization?

In the hashing file organization, a hash function is computed on some attribute of each

record. The result of the hash function specifies in which block of the file the record should be

placed.
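The idea can be sketched in Python, with blocks modelled as in-memory lists. The hash function shown (sum of character codes modulo the number of buckets) and the record layout are assumptions chosen only to keep the example deterministic.

```python
NUM_BUCKETS = 4

def bucket_for(key):
    """Hypothetical hash function: sum of character codes modulo the bucket count."""
    return sum(ord(ch) for ch in str(key)) % NUM_BUCKETS

def insert(buckets, record, key_attr):
    """Place the record in the block chosen by hashing its key attribute."""
    buckets[bucket_for(record[key_attr])].append(record)

def lookup(buckets, key, key_attr):
    """Only the one bucket selected by the hash function is searched."""
    return [r for r in buckets[bucket_for(key)] if r[key_attr] == key]

buckets = [[] for _ in range(NUM_BUCKETS)]
for rec in [{"id": "A101", "name": "Anu"}, {"id": "B205", "name": "Ravi"}]:
    insert(buckets, rec, "id")
found = lookup(buckets, "B205", "id")
missing = lookup(buckets, "Z999", "id")
```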

15. What is clustering file organization?

In the clustering file organization, records of several different relations are stored in the

same file.

16. What is an index?

An index is a structure that helps to locate desired records of a relation quickly, without

examining all records.

17. What are the types of indices?

The types of indices are,

Ordered indices and

Hash indices

Ordered indices: search keys are stored in sorted order

Hash indices: search keys are distributed uniformly across “buckets” using a “hash

function”.

18. What are the techniques to be evaluated for both ordered indexing and hashing?

The techniques to be evaluated for both ordered indexing and hashing are,

a. Access types

b. Access time

c. Insertion time

d. Deletion time and

e. Space overhead.


19. What are called index-sequential files?

The files that are ordered sequentially with a primary index on the search key are called

index-sequential files.


20. Which are the factors to be considered for the evaluation of indexing techniques?

[N/D − 10]

The following factors are considered for the evaluation of indexing techniques:

a. Access types supported efficiently

Example:

i. Records with a specified value in the attribute (or)

ii. Records with an attribute value falling in a specified range of values

b. Access time

c. Insertion time

d. Deletion time

e. Space overhead

21. What are ordered indices? [N/D − 11]

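An ordered index stores the values of the search keys in sorted order and associates with each search key the records that contain it. The records in the indexed file may themselves be stored in some sorted order, just as books in a library are stored according to some attribute such as the Dewey decimal number.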

22. Distinguish between dense index and sparse index. [N/D − 11]

Dense index:

An index record appears for every search-key value in the file.

The index record contains the search-key value and a pointer to the first data record with that search-key value. The rest of the records with the same search-key value are stored sequentially after the first record.

Sparse index:

An index record appears for only some of the search-key values.

Each index record contains a search-key value and a pointer to the first data record with that search-key value. To locate a record, we find the index entry with the largest search-key value that is less than or equal to the search-key value being sought, and then scan the file sequentially from the record that entry points to.
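The sparse-index lookup can be sketched in Python with the standard bisect module. The data blocks, the choice of one index entry per block, and the function name `sparse_lookup` are assumptions of this example; it also assumes unique search keys.

```python
import bisect

# Data file sorted on the search key; each inner list is one block of records.
blocks = [
    [(10, "a"), (20, "b"), (30, "c")],
    [(40, "d"), (50, "e"), (60, "f")],
    [(70, "g"), (80, "h"), (90, "i")],
]
# Sparse index: one entry (the first key) per block, not one per record.
sparse_keys = [block[0][0] for block in blocks]

def sparse_lookup(key):
    """Find the largest index entry <= key, then scan that block sequentially."""
    pos = bisect.bisect_right(sparse_keys, key) - 1
    if pos < 0:
        return None                      # key is smaller than every indexed key
    for k, value in blocks[pos]:
        if k == key:
            return value
    return None
```

A dense index over the same file would hold nine entries instead of three, trading extra space for a lookup that needs no sequential scan.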

23. When is it preferable to use a dense index rather than a sparse index? [M/J − 12]

It is preferable to use a dense index instead of a sparse index when the file is not sorted

on the indexed field (such as when the index is a secondary index) or when the index file is

small compared to the size of memory.

24. What is B-Tree index?

A B-tree index eliminates the redundant storage of search-key values: each search-key value appears only once in the tree. B-trees maintain their efficiency despite insertion and deletion of data.


25. What is a B+-Tree index?

A B+-tree index takes the form of a balanced tree in which every path from the root of the tree to a leaf of the tree is of the same length.

26. Mention the different hashing techniques. [M/J − 12]

The different hashing techniques are:

a. Closed hashing

b. Dynamic hashing

c. Extendable hashing

27. Differentiate static hashing from dynamic hashing? [M/J − 13]

In static hashing, the address of the disk block containing a desired record is obtained directly by computing a function on the search-key value of the record; the number of buckets is fixed.

Dynamic hashing allows the hash function to be modified dynamically to accommodate the growth or shrinkage of the database.


UNIT V ADVANCED TOPICS

1. What are the types of security?

Database security is a broad area that addresses many issues, including the following:

a. Various legal and ethical issues regarding the right to access certain information

b. Policy issues at the governmental, institutional, or corporate level as to what kinds of information should not be made publicly available

c. System-related issues such as the system levels at which various security functions should be enforced

d. The need in some organizations to identify multiple security levels and to categorize the data and users based on these classifications

2. Define Statistical Database Security

Statistical databases are used mainly to produce statistics about various

populations. The database may contain confidential data about individuals, which should

be protected from user access. However, users are permitted to retrieve statistical

information about the populations, such as averages, sums, counts, maximums,

minimums, and standard deviations.
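
As a rough sketch of this idea, the Python example below uses the standard `sqlite3` module with a hypothetical `person` table. The function `city_statistics` is an assumed interface that returns only aggregate values for a population and never the individual rows. A real statistical database would also have to guard against inference from very small populations, which this sketch does not attempt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, city TEXT, income REAL)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [
        ("Asha", "Chennai", 40000.0),
        ("Ravi", "Chennai", 60000.0),
        ("Mala", "Madurai", 50000.0),
    ],
)

def city_statistics(city):
    """Return (count, average income) for one population, never individual rows."""
    return conn.execute(
        "SELECT COUNT(*), AVG(income) FROM person WHERE city = ?", (city,)
    ).fetchone()
```

Calling `city_statistics("Chennai")` yields the count and average income of that population without exposing any individual's income.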

3. What are the types of privileges?

There are two levels for assigning privileges to use the database system:

a. The account level. At this level, the DBA specifies the particular privileges that each account holds independently of the relations in the database.

b. The relation (or table) level. At this level, the DBA can control the privilege to access each individual relation or view in the database.

4. What is meant by code injection?

Code injection is a type of attack that attempts to add additional SQL statements or

commands to an existing SQL statement by exploiting a computer bug caused by the

processing of invalid data. The attacker can inject or introduce code into a computer

program to change the course of execution. Code injection is a popular technique for

system hacking or cracking to gain information.
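
A minimal Python sketch of such an attack, using the standard `sqlite3` module, is given below. The `users` table and the two login functions are hypothetical. The first builds the SQL text by concatenating user input and is therefore injectable; the second binds the input as parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret')")

def login_unsafe(name, password):
    # Vulnerable: user input is concatenated into the SQL text.
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchone()[0] > 0

def login_safe(name, password):
    # Parameterized: input is bound as data and is never parsed as SQL.
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0

# Input that turns the WHERE clause of the unsafe query into a condition that is always true.
malicious = "' OR '1'='1"
```

With the malicious input as the password, `login_unsafe("alice", malicious)` succeeds without the real password, whereas `login_safe("alice", malicious)` is rejected.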

5. What is meant by function call injection?

In a function call injection attack, a database function or operating system function

call is inserted into a vulnerable SQL statement to manipulate the data or make a

privileged system call. For example, it is possible to exploit a function that performs

some aspect related to network communication. In addition, functions that are contained

in a customized database package, or any custom database function, can be executed as

part of an SQL query. In particular, dynamically created SQL queries can be exploited

since they are constructed at run time.

6. What are the risks associated with SQL injection?

The risks associated with SQL injection attacks are:

Database fingerprinting

Denial of service

Bypassing Authentication

Identifying injectable parameters

Executing Remote Commands

Performing Privilege Escalation

7. What is meant by covert channels?

A covert channel allows a transfer of information that violates the security or the

policy. Specifically, a covert channel allows information to pass from a higher

classification level to a lower classification level through improper means. Covert

channels can be classified into two broad categories:

Timing channels and

Storage channels

8. Define Encryption

Encryption is the conversion of data into a form, called a cipher text, which

cannot be easily understood by unauthorized persons. It enhances security and privacy

when access controls are bypassed, because in cases of data loss or theft, encrypted data

cannot be easily understood by unauthorized persons.

9. Define Decryption

Decryption is the process of transforming cipher text back into plaintext.

10. What is a content encryption algorithm?

A message encrypted with a secret key can be decrypted only with the same secret

key. Algorithms used for symmetric key encryption are called secret-key algorithms.

Since secret-key algorithms are mostly used for encrypting the content of a message, they

are also called content encryption algorithms.
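
The symmetric property (the same secret key both encrypts and decrypts) can be illustrated with a toy Python sketch. The repeating-key XOR cipher used here is an assumption chosen only because it fits in a few lines; it is not secure and is not one of the algorithms used in practice.

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR each byte with the repeating key. NOT secure."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

secret_key = b"k3y"            # hypothetical shared secret
plaintext = b"salary=50000"

ciphertext = xor_cipher(plaintext, secret_key)   # encryption
recovered = xor_cipher(ciphertext, secret_key)   # decryption with the same key
```

Applying `xor_cipher` with the same key a second time restores the plaintext, while applying it with a different key does not.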

11. Define Digital Signature

A digital signature is an example of using encryption techniques to provide

authentication services in electronic commerce applications. Like a handwritten

signature, a digital signature is a means of associating a mark unique to an individual

with a body of text. The mark should be unforgeable, meaning that others should be

able to check that the signature comes from the originator.
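
The idea can be sketched in Python with a textbook-sized RSA key pair. The notes do not name a signature scheme, so RSA is an assumption here, and the key values (`N`, `E`, `D`) are illustrative and far too small for real use.

```python
import hashlib

# Toy RSA key pair (hypothetical values): N = 61 * 53, E is public, D is private.
N, E, D = 3233, 17, 2753

def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can compute this value."""
    return pow(digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Anyone who knows the public pair (N, E) can check the signature."""
    return pow(signature, E, N) == digest(message)

message = b"Transfer 100 to account 42"
signature = sign(message)
```

The originator signs with the private exponent, and any recipient can call `verify(message, signature)` with the public values to check that the signature comes from the originator; an altered signature fails the check.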

12. What is meant by granting privileges?

The DBA’s responsibilities include granting privileges to users who need to use

the system and classifying users and data in accordance with the policy of the

organization.

13. What is meant by revoking privileges?

In SQL, a REVOKE command is included for the purpose of canceling privileges.

For example, suppose that account A1 decides to revoke the SELECT privilege on the

EMPLOYEE relation from account A3; A1 can then issue this command:

REVOKE SELECT ON EMPLOYEE FROM A3;

14. List out the types of privileges available in SQL.

Types of privileges that can be granted on a relation R:

SELECT (retrieval or read) privilege on R

Modification privileges on R

References privilege on R

15. Define Data Mining

Data mining can be used in conjunction with a data warehouse to help with

certain types of decisions. Data mining can be applied to operational databases with

individual transactions. To make data mining more efficient, the data warehouse should

have an aggregated or summarized collection of data. Data mining helps in extracting

meaningful new patterns that cannot necessarily be found by merely querying or

processing data or metadata in the data warehouse.

16. What is meant by Knowledge Discovery?

Knowledge Discovery in Databases, frequently abbreviated as KDD, typically

encompasses more than data mining. The knowledge discovery process comprises six

phases: data selection, data cleansing, enrichment, data transformation or encoding, data

mining, and the reporting and display of the discovered information.

17. Define Clustering

Cluster analysis or clustering is the task of grouping a set of objects in such a way

that objects in the same group are more similar to each other than to those in other

groups.
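
As an illustration, the Python sketch below groups one-dimensional points with k-means, one common clustering method that the notes themselves do not name. The sample points, the starting centers and the function name `k_means_1d` are assumptions made for the example.

```python
def k_means_1d(points, centers, iterations=10):
    """Assign each number to its nearest center, then move each center to its group's mean."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return clusters, centers

clusters, centers = k_means_1d([1, 2, 3, 10, 11, 12], centers=[0.0, 5.0])
```

For this input the points settle into the groups `[1, 2, 3]` and `[10, 11, 12]`, so that each point is closer to the members of its own group than to those of the other group.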

18. Define Distributed Database

A Distributed DataBase (DDB) is a collection of multiple logically interrelated

databases distributed over a computer network.

19. What is meant by DDBMS?

A Distributed Database Management System (DDBMS) is a software system that

manages a distributed database while making the distribution transparent to the user.

20. What are the types of transparency in DDB?

The types of transparency are:

Data organization transparency

Replication transparency

Fragmentation transparency

Design transparency