database book report

Upload: justseemetoo

Post on 09-Apr-2018


  • 8/8/2019 Database Book Report


    Chapter 1

    Introduction to Database Management Systems

    A database system is one of the critical components for building applications. It provides an array of features that can be used to ensure optimal utilization of data and so improve decision-making in organizations. This chapter introduces the main concepts used in database systems and highlights the advantages of applying a database system to operational data.

    1.1 Database

    Databases have become an integral part of modern life. Each day we come across situations that involve some interaction with a database. For example, activities like going to the bank to deposit or withdraw funds, or making an airline or railway reservation, entail interaction with the relevant databases.

    A database is a collection of related data stored in a standardized format, capable of being shared by multiple users. Here, data means known facts that have implicit meaning. For example, consider the names, addresses and departments of employees stored in an employee file. It is a collection of related data with an implicit meaning and hence is a database.

    A Database Management System (DBMS) is software that allows users to create and maintain a database. A DBMS enables the user to:

    Specify the data types, structures and constraints for the data;

    Store the data; and

    Manipulate the database by performing functions such as querying the database, updating the database, and generating reports from the data stored in the database.

    The database and the associated DBMS software together are known as the database system. The four basic elements of a database are:

    Data

    Relationships

    Constraints

    Schema

    Data is the plural of datum, a single piece of information. The structure of the records in a database can be specified by using different types of data elements. For example, each employee record includes data elements to represent the employee's name, address, department, etc. The database system contains not only the database but also a complete definition of the database structure. The system catalog stores this definition. It contains information about the structure of each file, the type of each data item, and various constraints on the data. The information stored in the catalog is called meta-data.
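The idea of a system catalog holding meta-data can be tried out with a small sketch. It assumes SQLite (through Python's standard `sqlite3` module) as the DBMS, which the report does not prescribe, and the `employee` table is a hypothetical example.

```python
import sqlite3

# Hypothetical employee table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " name TEXT NOT NULL,"
    " address TEXT,"
    " department TEXT)"
)

# sqlite_master is SQLite's catalog table: it stores the definition of each table.
catalog_rows = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

print(catalog_rows[0][0])
print(columns)
```

Here the data about the table (its name, columns and declared types) is read back from the DBMS itself, without looking at any employee record.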

    A relationship is a mapping or association between data elements.

    Constraints are used to maintain the accuracy or correctness of the data in the database. A schema separates the physical aspects of data storage from the logical aspects of data representation. It defines different views of the database for diverse system components. The internal schema depicts how the data is stored physically. The external schema is useful to the users as it defines views of the database for individual users. The conceptual schema conceals the details of physical storage structures and concentrates on describing data types, relationships, operations and constraints. The data are stored in such a manner that they are not dependent on the user programs. The three schemas discussed above make it easier to achieve true data independence: when the schema is altered at one level, the schema at the next higher level remains unaltered.

    A database is organized in such a manner that a program can access desired elements of data quickly. Traditional databases are structured by fields, records, and files. A field is a single element of information; a record is a complete collection of fields; and a file is a collection of records. A database management system (DBMS) is required to access the required data from a database.
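As a rough illustration of an external schema, the sketch below defines a view over a base table. It assumes SQLite via Python's `sqlite3` module; the table, the view name `employee_directory` and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the full (hypothetical) employee table.
conn.execute("CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Asha", "Technical", 50000), ("Ravi", "Personnel", 42000)],
)

# External level: a view for users who should not see salaries.
conn.execute("CREATE VIEW employee_directory AS SELECT name, department FROM employee")

query = "SELECT name, department FROM employee_directory ORDER BY name"
directory_before = conn.execute(query).fetchall()

# Changing the underlying table leaves the view, and programs using it, untouched.
conn.execute("ALTER TABLE employee ADD COLUMN phone TEXT")
directory_after = conn.execute(query).fetchall()
```

The view returns the same rows before and after the table is altered, which is the kind of independence between levels that the three schemas aim for.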

    1.2 Flat model

    The flat (or table) model consists of a single, two-dimensional array of data elements, where all members of a given column are assumed to be similar values, and all members of a row are assumed to be related to one another. For instance, a system security database might use columns for name and password. Each row would have the specific password associated with an individual user. Columns of the table often have a type associated with them, defining them as character data, date or time information, integers, or floating point numbers. This may not strictly qualify as a data model, as defined above.

    Figure 1.1 Flat File Model
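A minimal sketch of the flat model, using the name and password columns mentioned above. The sample users and the `lookup` helper are hypothetical and only show that every row shares the same columns.

```python
# A flat model: one two-dimensional table; every row has the same columns.
COLUMNS = ("name", "password")
rows = [
    ("alice", "s3cret"),
    ("bob", "hunter2"),
]


def lookup(rows, column, key_column, key):
    """Return the value in `column` for the first row whose `key_column` equals `key`."""
    col = COLUMNS.index(column)
    key_col = COLUMNS.index(key_column)
    for row in rows:
        if row[key_col] == key:
            return row[col]
    return None


print(lookup(rows, "password", "name", "bob"))
```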

    1.3 Hierarchical Database Model

    In a hierarchical database model, the data elements are linked in the form of an inverted tree structure with the root at the top and the branches formed below. Below the single root data element are subordinate elements, each of which, in turn, has one or more other elements. There is a parent-child relationship among the data elements of a hierarchical database. There may be many child elements under each parent element, but there can be only one parent element for any child element. The branches in the tree are not connected.

    The hierarchical data model is used in several database applications because the data elements of many applications can be neatly organized in the form of a hierarchical tree structure. The main limitation of this structure is that it does not support flexible data access, because data can be accessed only by following the path down the tree structure.

    [Figure: a tree with Organization at the root, the Personnel Department and the Technical Department below it, and Managers, Support staff, Technicians and Engineers below the departments; one node is labelled "A parent segment".]

    Figure 1.2 An example of hierarchical database

    Advantages:

    It is the simplest database model.

    It is a secure model, as nobody can modify a child without consulting its parent.

    Searching is fast and easy if the parent is known.

    It is very efficient in handling one-to-many relationships.

    Disadvantages:
    o It is an old-fashioned, outdated database model.
    o Modification or addition of a child without consulting its parent is impossible.
    o It cannot handle many-to-many relationships.
    o It increases redundancy.
    o It does not support flexible data access, because data can be accessed only by following the path down the tree structure.
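The single-parent rule and the top-down access path can be sketched as follows. The `Segment` class and the `find` helper are hypothetical names, and the placement of Engineers under the Technical Department is an assumption loosely based on Figure 1.2.

```python
class Segment:
    """A node in a hierarchical database: exactly one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Names from the root down to this segment (the only access path)."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


def find(root, name):
    """Depth-first search that starts at the root and follows parent-to-child links."""
    if root.name == name:
        return root
    for child in root.children:
        found = find(child, name)
        if found is not None:
            return found
    return None


root = Segment("Organization")
technical = Segment("Technical Department", root)
personnel = Segment("Personnel Department", root)
engineers = Segment("Engineers", technical)

print(find(root, "Engineers").path())
```

Every lookup starts at the root, and a segment can be attached to only one parent, which is why many-to-many relationships do not fit this structure.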

    1.4 Network Database Model

    A network database model is an extension of the hierarchical database structure. In this model also, the data elements of a database are organized in the form of parent-child relationships, and all types of relationships among the data elements must be determined when the database is first designed. In a network database, a child data element can have more than one parent element or no parent at all. Moreover, in this type of database, the database management system permits the extraction of the needed information from any data element in the database structure, instead of starting from the root data element.


    Advantages:

    It is more flexible than the hierarchical model because it accepts many-to-many relationships.

    Searching is faster because of multidirectional pointers.

    It promotes database integrity.

    It provides data independence.

    Disadvantages:
    o It is a complex type of database model.
    o It is less secure than the hierarchical model, as it is open to all.
    o It needs long programs to handle the relationships.
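A small sketch of the network model, in which a child may have several parents or none and access may start from any element. The names come from Figure 1.3, but the specific links between them are assumptions, and the `link` helper is hypothetical.

```python
from collections import defaultdict

parents = defaultdict(set)   # child -> set of parents
children = defaultdict(set)  # parent -> set of children


def link(parent, child):
    """Record a parent-child relationship in both directions."""
    parents[child].add(parent)
    children[parent].add(child)


link("College", "English")
link("College", "Math")
link("English", "Sita")
link("Math", "Sita")   # Sita has two parents: a many-to-many relationship
link("Math", "Ram")

# Access can start from any element, not only from the root.
sita_subjects = sorted(parents["Sita"])
math_students = sorted(children["Math"])

# An element that was never linked simply has no parent.
shyam_parents = sorted(parents["Shyam"])
```

Keeping pointers in both directions is what allows navigation from either side, at the cost of more bookkeeping than in the hierarchical model.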

    1.5 Relational Database Model

    In a relational database model, the data elements are organized in the form of multiple tables with rows and columns. Each table of the database is stored as a separate file. Each table column represents a data field and each row represents a data record. A record is also known as a tuple. The data in one table is related to the data in another table through a common field.

    [Figure: a network with College at the top, the subjects English, Math, Computer and Account below it, and the students Sita, Geeta, Rita, Ram, Mita and Shyam linked to the subjects; one child element is marked as having no parent element.]

    Figure 1.3 An example of a network database

    Figure 1.4 An example of relational database.

    The relational database model provides greater flexibility of data organization and future enhancement than the hierarchical and network database models. If new data is to be added to an existing relational database, it is not necessary to redesign the database. Rather, a new table containing the new data can be added to the database and then related to the existing tables through common key fields.
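Adding new data without redesigning the database can be sketched as below, assuming SQLite via Python's `sqlite3` module. The `student` and `loan` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_number TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES ('S-1', 'Sita')")

# New requirement: track library loans. The existing table is left unchanged;
# a new table is related to it through the common key field student_number.
conn.execute(
    "CREATE TABLE loan ("
    " student_number TEXT REFERENCES student(student_number),"
    " book_title TEXT)"
)
conn.execute("INSERT INTO loan VALUES ('S-1', 'Database Basics')")

loans = conn.execute(
    "SELECT s.name, l.book_title FROM student AS s"
    " JOIN loan AS l ON l.student_number = s.student_number"
).fetchall()
```

The join brings the two tables together at query time, so nothing about the original `student` table had to change.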

    Advantages:

    Very little redundancy. Normalization of the database is possible.

    Quick database processing is possible.

    Since one table links to another through a common key field, a rule implemented in one table can easily be implemented in another.

    Disadvantages:
    o It is more complex than other models.
    o Too many rules make the database less user-friendly.

    1.6 DBMS

    Data

    Data is commonly defined as raw facts or observations, typically about physical phenomena or business transactions.

    An example of data would be the marks obtained by students in different subjects.

    Data can be in any form:
    numerical (mathematically transformable)
    textual (more correctly, alphanumerical)
    graphical (components are known entities; mathematically describable)
    image: fixed (reflection, photograph)
    image: moving (video)
    sound

    Information

    Information is defined as refined or processed data that has been transformed into a meaningful and useful form for specific users. For example, once the marks obtained by students are processed, they become information that is meaningful and from which we can decide which student stood first, second and so forth.

    Information comes from data and takes the form of tables, graphs, diagrams, etc.

    Database

    A database is a collection of related data. By data, we mean known facts that can be recorded and that have implicit meaning. For example, consider the names, telephone numbers and addresses of people you know. You may have recorded this data in an indexed address book, or you may have stored it on a hard disk drive, using Microsoft Access or Excel. This is a collection of related data with an implicit meaning and hence is a database.

    A database is designed, built, and populated with data for a specific purpose.

    Database management system (DBMS)

    It is a collection of interrelated data and a set of programs to access the data.

    A database management system (DBMS) is a collection of programs that enables users to create and maintain a database. The DBMS is hence a general-purpose software system that facilitates the process of defining, constructing, manipulating, and sharing databases among various users and applications.

    Defining a database involves specifying the data types, structures, and constraints for the data to be stored in the database.

    Constructing the database is the process of storing the data itself on some storage medium that is controlled by the DBMS.

    Manipulating a database includes such functions as querying the database to retrieve specific data, updating the database and generating reports from the data.

    Sharing a database allows multiple users and programs to access the database concurrently.
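The first three activities above (defining, constructing and manipulating) can be sketched in a few lines. The sketch assumes SQLite via Python's `sqlite3` module and reuses the students' marks example; the table name, the names and the scores are hypothetical. Sharing among concurrent users is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Defining: data types, structure and constraints.
conn.execute(
    "CREATE TABLE marks ("
    " student TEXT NOT NULL,"
    " subject TEXT NOT NULL,"
    " score INTEGER CHECK (score BETWEEN 0 AND 100))"
)

# Constructing: storing the data itself.
conn.executemany(
    "INSERT INTO marks VALUES (?, ?, ?)",
    [("Ram", "Math", 82), ("Sita", "Math", 91), ("Ram", "English", 75)],
)

# Manipulating: update a value, then query to produce a small report.
conn.execute("UPDATE marks SET score = 78 WHERE student = 'Ram' AND subject = 'English'")
report = conn.execute(
    "SELECT student, SUM(score) FROM marks GROUP BY student ORDER BY SUM(score) DESC"
).fetchall()

# The constraint defined above rejects invalid data.
try:
    conn.execute("INSERT INTO marks VALUES ('Gita', 'Math', 140)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The report turns raw marks (data) into totals per student (information), and the `CHECK` constraint is enforced by the DBMS rather than by each application.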

    Examples of DBMSs are MS Access, Oracle, MySQL, etc.

    Database system

    A database system consists of a database (DB), a database management system (DBMS) and application programs. A database system is essentially a computerized record-keeping system. The database is a repository for a collection of computerized data files. Users of the database system can perform a variety of operations on such files.

    A database system involves four major components:
    Data
    Hardware
    Software
    User

    Objectives of a DBMS:
    To provide large storage space for relevant data.
    To provide easy access to the data for the users.
    To provide quick response to user requests for data.
    To remove duplicate (redundant) data.
    To reflect the latest modifications in the database immediately.
    To allow multiple users to be active at one time.
    To allow the database system to grow as the organization grows.
    To provide maximum protection of the data from physical damage and unauthorized access.

    Database Applications:

    Banking: all transactions

    Airlines: reservations, schedules

    Universities: registration, grades

    Sales: customers, products, purchases

    Online retailers: order tracking, customized recommendations

    Manufacturing: production, inventory, orders, supply chain

    Human resources: employee records, salaries, tax deductions

    Databases touch all aspects of our lives. Some database products are:
    1. Oracle
    2. Sybase
    3. Microsoft SQL Server
    4. DB2 (IBM)
    5. MySQL
    6. Postgres
    7. dBASE
    8. MS Access

    1.7 Relational DBMS

    Edgar Codd worked at IBM in San Jose, California, in one of their offshoot offices that was primarily involved in the development of hard disk systems. He was unhappy with the navigational model of the CODASYL approach, notably the lack of a "search" facility. In 1970, he wrote a number of papers that outlined a new approach to database construction that eventually culminated in the groundbreaking A Relational Model of Data for Large Shared Data Banks. In this paper, he described a new system for storing and working with large databases. Instead of records being stored in some sort of linked list of free-form records as in CODASYL, Codd's idea was to use a "table" of fixed-length records. A linked-list system would be very inefficient when storing "sparse" databases where some of the data for any one record could be left empty. The relational model solved this by splitting the data into a series of normalized tables, with optional elements being moved [...]


    Multics, and now there are two new implementations: Alphora Dataphor and Rel. All other DBMS implementations usually called relational are actually SQL DBMSs. In 1970, the University of Michigan began development of the MICRO relational database management system. It was used to manage very large data sets by the US Department of Labor, the Environmental Protection Agency and researchers from the University of Alberta, the University of Michigan and Wayne State University. It ran on mainframe computers using the Michigan Terminal System. The system remained in production until 1998.

    Figure 1.5 In the relational model, related records are linked together with a key.

    1.8 Features of Data in a Database

    The data in a database should possess the following features:

    It should be sharable among different users and applications.

    It should be valid or correct with respect to the real-world entity that it represents.

    It should be protected from unauthorized access and be secured.

    The consistency of the data should be maintained.

    It should be non-redundant. No two data items in a database should represent the same entity.

    Data should be independent of the application, i.e. the DBA must be able to change the storage structure or access technique according to changing requirements, without affecting the existing applications.

    Chapter 2

    Database System Concepts and Architecture


    In the database approach, the details of data storage that are not required by most database users are concealed. A data model, which describes the structure of a database, helps in achieving this abstraction. The structure of a database means the data types, relationships, and constraints: the conceptual constructs that characterize the data. This chapter gives an insight into the various data models and explores the advantages and disadvantages of each of them.

    2.1 Data Models and Their Categories

    Data modeling helps in understanding the meaning of the data. Building a data model requires careful analysis of the entities, relationships and attributes, and this is precisely what the designer does: the designer discovers the semantics of the organization's data.

    The data is modeled to ensure that

    Each user's view of the data is understood.

    The nature of the data, independent of its physical representations, is clearly understood.

    The use of data across applications is perceived correctly.

    A data model is a means of presenting the designer's understanding of the information needs of the organization. Organizations have resorted to a standard way of modeling their data by following a specific data modeling approach for all their database development projects. The Entity-Relationship (ER) model is the most popular high-level conceptual data model. This model is commonly used for the conceptual design of database applications.

    A data model should have the following characteristics:

    INTEGRITY: Conformance with the organization's way of using and managing information.

    STRUCTURAL VALIDITY: Consistency with the organization's mode of defining and organizing information.

    EXPRESSABILITY: Capability to differentiate between data, relationships between data, and constraints.

    NON-REDUNDANCY: Elimination of redundant data, i.e., representing a piece of information only once.

    SHAREABILITY: Not confined to use by any particular application, i.e., usable by multiple applications.

    EXTENSIBILITY: Ability to support new requirements without affecting the existing users.

    DIAGRAMMATIC REPRESENTATION: Ability to be represented using easily understood diagrammatic notations.

    Constructs are used to define the database structure. Constructs typically include elements (and their data types) as well as groups of elements (e.g. entity, record, table), and relationships among such groups. Constraints specify some restrictions on valid data; these constraints must be enforced at all times.

    Data model operations are used for specifying database retrievals and updates by referring to the constructs of the data model. Operations on the data model may include basic model operations (e.g. generic insert, delete, update) and user-defined operations (e.g. compute_student_gpa, update_inventory).
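The difference between basic and user-defined operations can be sketched as follows. The in-memory `grades` list, the record layout, the function names `insert` and `delete`, and the credit-weighted definition of GPA are all assumptions made for the example; only the operation name `compute_student_gpa` comes from the text.

```python
# A hypothetical grade table held as a list of dictionaries.
grades = []


def insert(student, course, points, credits):
    """Basic operation: add one record."""
    grades.append(
        {"student": student, "course": course, "points": points, "credits": credits}
    )


def delete(student, course):
    """Basic operation: remove the matching record, if any."""
    grades[:] = [
        g for g in grades if not (g["student"] == student and g["course"] == course)
    ]


def compute_student_gpa(student):
    """User-defined operation: credit-weighted average of grade points."""
    rows = [g for g in grades if g["student"] == student]
    total_credits = sum(g["credits"] for g in rows)
    if total_credits == 0:
        return None
    return sum(g["points"] * g["credits"] for g in rows) / total_credits


insert("S-111", "Databases", 4.0, 3)
insert("S-111", "Calculus", 3.0, 3)
insert("S-111", "History", 2.0, 2)
delete("S-111", "History")

print(compute_student_gpa("S-111"))
```

The basic operations know nothing about what the data means, while the user-defined operation encodes a rule specific to the application.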

    A Database Management System can take several approaches to manage the data. Each approach constitutes a data model. It specifies mechanisms for data storage and retrieval.

    A DBMS manipulates information from some real-world application, regardless of the underlying database model. From the application perspective, different database models have a common goal: to allow the storage and retrieval of information. The major difference between the database models lies in the way they depict constraints and relationships among the data elements. The various data models that have been suggested fall under three different categories:

    CONCEPTUAL (HIGH-LEVEL, SEMANTIC) DATA MODELS: Provide concepts that are close to the way many users perceive data. Also called entity-based or object-based data models.

    PHYSICAL (LOW-LEVEL, INTERNAL) DATA MODELS: Provide concepts that describe details of how data is stored in the computer. These are usually specified in an ad-hoc manner through DBMS design and administration manuals.

    IMPLEMENTATION (REPRESENTATIONAL) DATA MODELS: Provide concepts that fall between the above two, used by many commercial DBMS implementations (e.g. relational data models used in many commercial systems).

    2.2 History of Data Models

    The Network Model was first implemented by Honeywell in 1964-65 (the IDS system). It was adopted heavily due to the support of CODASYL (Conference on Data Systems Languages) and later implemented in a large variety of systems such as IDMS (Cullinet, now Computer Associates), DMS 1100 (Unisys), IMAGE (HP, Hewlett-Packard) and VAX-DBMS (Digital Equipment Corp., later Compaq, now HP). The network model is able to model complex relationships and represents the semantics of add/delete on the relationships. It can handle most modeling situations using record types and relationship types. Its language is navigational and uses constructs like FIND, FIND member, FIND owner, FIND NEXT within set, GET, etc. Programmers can do optimal navigation through the database. In the network model, data are represented by collections of records, and relationships among data are represented by links.

    [Figure: the record types STUDENT, GRADE_REPORT, SECTION, COURSE and PREREQUISITE connected by the set types IS_A, HAS_A, COURSE_OFFERINGS, SECTION_GRADES and STUDENT_GRADES.]

    Figure 2.1 Example of Network Model Schema

    The network model has the following disadvantages:
    o Navigational and procedural nature of processing.
    o The database contains a complex array of pointers that thread through a set of records. It has little scope for automated query optimization.
    o Making structural modifications to the database is very difficult in the network database model because the data access method is navigational. Any change made to the database structure requires the application programs to be modified before they can access data. Though the network database model achieves data independence, it still fails to achieve structural independence.

    The Hierarchical Data Model is one of the oldest database models, dating from the late 1950s. The hierarchical model is based on the assumption that a tree structure is the most frequently occurring relationship. This assumption is no longer accepted today. In fact, many of the limitations and shortcomings of the hierarchical model result from this overly restrictive view of relationships. The hierarchical model organizes data elements as tabular rows, one for each instance of an entity.

    Consider a company's organizational structure. At the top we have a General Manager (GM). Under him we have several Deputy General Managers (DGMs). Each DGM looks after some departments, and each department has a manager and many employees. When represented in the hierarchical model, there will be separate rows for the GM, each DGM, each department, each manager and each employee. The row position implies a relationship to other rows. A given employee belongs to the department that is closest above it in the list, the department belongs to the manager that is immediately above it in the list, and so on. The hierarchical model is similar to the network model in the sense that data and relationships among data are represented by records and links, respectively. You can locate the set of employees working for, say, Manager X by first locating Manager X and then including every employee in the list after X and before the next occurrence of a manager or the end of the list. As the linearized tree is an abstraction, the term logical proximity is more appropriate for the hierarchical model.
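The lookup rule described above (take every employee listed after Manager X and before the next manager or the end of the list) can be sketched directly. The row layout and the sample names are hypothetical.

```python
# Each row is (record_type, name); position in the list encodes the hierarchy.
rows = [
    ("GM", "General Manager"),
    ("DGM", "DGM 1"),
    ("Department", "Technical"),
    ("Manager", "X"),
    ("Employee", "E1"),
    ("Employee", "E2"),
    ("Department", "Personnel"),
    ("Manager", "Y"),
    ("Employee", "E3"),
]


def employees_of(rows, manager_name):
    """Collect employees listed after the manager and before the next manager or end of list."""
    result = []
    found = False
    for record_type, name in rows:
        if record_type == "Manager":
            if found:
                break  # the next manager has been reached
            found = name == manager_name
        elif found and record_type == "Employee":
            result.append(name)
    return result


print(employees_of(rows, "X"))
```

Because the relationship lives only in the row order, inserting or moving a row changes who belongs to whom, which is one reason structural changes are costly in this model.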

    The hierarchical model had many advantages over the file systems it replaced. The advantages and features of hierarchical database systems can be said to be the reason for the development of the database models that followed. The main advantages of this database model are:

    SIMPLICITY: Since the database is based on the hierarchical structure, the relationship between the various layers is logically (conceptually) simple. Thus, the design of a hierarchical database is simple.

    DATA SECURITY: The hierarchical model was the first database model to offer data security that is provided and enforced by the DBMS.

    DATA INTEGRITY: Since the hierarchical model is based on the parent/child relationship, there is always a link between the parent segment and the child segments under it. Because the child segments are always automatically referenced to their parent, this model promotes data integrity.

    EFFICIENCY: The hierarchical database model is very efficient when the database contains a large number of 1:n (one-to-many) relationships and when the users require a large number of transactions using data whose relationships are fixed.

    The main disadvantages of the hierarchical database model are:

    IMPLEMENTATION COMPLEXITY: Although the hierarchical database model is conceptually simple and easy to design, it is quite complex to implement. The database designers should have very good knowledge of the physical data storage characteristics.

    DATABASE MANAGEMENT PROBLEMS: If you make any changes in the structure of a hierarchical database, you must make the necessary changes in all the application programs that access the database. Thus, maintaining the database and the applications can become very cumbersome.

    LACK OF STRUCTURAL INDEPENDENCE: Structural independence exists when changes made to the database structure do not affect the DBMS's ability to access data. Hierarchical database systems use physical storage paths to navigate to the different data segments, so the application programmer should have a good knowledge of the relevant access paths. If the physical structure is changed, the applications will also have to be altered. Thus, in a hierarchical database the benefits of data independence are limited by structural dependence.

    PROGRAMMING COMPLEXITY: Due to the structural dependence and the navigational structure, the application programmers and the end users must know precisely how the data is distributed physically in the database in order to access data. This requires knowledge of a complex pointer system, which is difficult for users who have little or no programming knowledge.

    IMPLEMENTATION LIMITATION: Many common relationships do not conform to the 1:n format required by the hierarchical model. Many-to-many (n:n) relationships, which are more common in real life, are very difficult to implement in a hierarchical model.

    The Relational Model uses tables to represent the data and the relationships among those data. Each table has multiple columns, and each column is identified by a unique name. Figure 2.2 shows a sample relational database comprising two tables: one shows student details, and the other shows the balance fee they need to deposit.

    Student_name  Student_address  Student_city   Student_course  Student_number
    Smith         South Knott      Anaheim        B.Tech          S-111
    Williams      Laguna           San Francisco  B.Sc.           S-211

    Student_number  Fee_balance
    S-111           27500
    S-211           18900

    Figure 2.2 A Sample Relational Database

    In the sample database of Figure 2.2, each row in the table represents a different student. Relationships link rows from two tables on the basis of the key field, in this case the student number. The main advantages of the relational model are:

    STRUCTURAL INDEPENDENCE: The relational database model has structural independence, i.e. changes made in the database structure do not affect the DBMS's capability to access data. The database users are unaware of the details of the data storage because the relational model does not depend on a navigational data access system.

    SIMPLICITY: The relational model is the simplest model at the conceptual level. It allows the designer to concentrate on the logical view of the database, leaving aside the physical data storage details.

    EASE OF DESIGNING, IMPLEMENTATION, MAINTENANCE AND USAGE: Due to the inherent features of data independence and structural independence, the relational model makes it easy to design, implement, maintain and use the databases.
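The two tables of Figure 2.2 and the link through the student number can be reproduced in a short sketch. It assumes SQLite via Python's `sqlite3` module; the table names `student` and `fee` are chosen for the example, and the split of each row into address and city follows the reading of the figure given above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_name TEXT, student_address TEXT, student_city TEXT,"
    " student_course TEXT, student_number TEXT PRIMARY KEY)"
)
conn.execute("CREATE TABLE fee (student_number TEXT, fee_balance INTEGER)")

conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        ("Smith", "South Knott", "Anaheim", "B.Tech", "S-111"),
        ("Williams", "Laguna", "San Francisco", "B.Sc.", "S-211"),
    ],
)
conn.executemany("INSERT INTO fee VALUES (?, ?)", [("S-111", 27500), ("S-211", 18900)])

# The key field student_number links rows of the two tables.
balances = conn.execute(
    "SELECT s.student_name, f.fee_balance FROM student AS s"
    " JOIN fee AS f ON f.student_number = s.student_number"
    " ORDER BY s.student_name"
).fetchall()
```

The query names only tables and columns, never storage paths, which is the structural independence listed among the advantages.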

    The disadvantages are:

    HARDWARE OVERHEADS: The RDBMS needs comparatively powerful hardware as it hides the implementation complexities and the physical data storage details from the users. With modern computers, increased processing power is not a big issue.

    EASE OF DESIGN CAN RESULT IN BAD DESIGN: As the relational database is easy to design and use, it can result in the development and implementation of poorly designed database management systems. As the size of the database increases, several problems may creep in: system slowdown, performance degradation and data corruption.

    INFORMATION ISLAND PHENOMENON: As relational database systems are easy to use and implement, people or departments may create their own databases and applications. This situation might hinder the information integration that is necessary for the smooth and efficient functioning of the organization. Problems like data inconsistency, data duplication and data redundancy may also crop up.

    All the issues given above can be avoided if the organization enforces good database standards and has a properly designed database. The disadvantages of the relational database model are minor compared to its advantages and capabilities. One can do away with the drawbacks stated above by properly implementing the database model.

    Relational database technology is not good at handling the needs of complex information systems. Relational database design is really a process of trying to figure out how to represent real-world objects within the confines of tables in such a way that good performance results and maintaining data integrity is possible. Object database design is quite different. For the most part, object database design is a fundamental part of the overall application design process. The object classes used by the programming language are the classes used by the ODBMS. Because their models are consistent, there is no need to transform the program's object model into something unique for the database manager.

    A data model is a collection of mathematically well-defined concepts that enable one to consider and express the static and dynamic properties of data-intensive applications. A data model consists of:

    Static properties such as objects, attributes and relationships;

    Integrity rules over objects and operations; and

    Dynamic properties such as operations or rules defining new database states based on applied state changes.

    The object-oriented model represents an entity as a class. A class represents both the attributes and the behavior of the entity. For example, a book class will have not only book attributes such as ISBN, Title, Author, Publisher, Year of Publishing, Distributor, Price, etc. but also procedures that specify actions expected of a book, such as UpdatePrice (updating the price). Instances of the class (objects) correspond to individual books. Within an object the class attributes take specific values, which distinguish one book (object) from another. However, the behavior patterns of the class are shared by all the objects belonging to the class.
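The book class described above might look like the sketch below. Only a few of the listed attributes are included, the sample values are placeholders, and the validation rule inside `update_price` is an assumption added for the example.

```python
from dataclasses import dataclass


@dataclass
class Book:
    """Entity attributes and behaviour kept together in one class."""

    isbn: str
    title: str
    author: str
    price: float

    def update_price(self, new_price: float) -> None:
        """Behaviour shared by every Book object (UpdatePrice in the text)."""
        if new_price < 0:
            raise ValueError("price must not be negative")
        self.price = new_price


# Two objects of the same class: different attribute values, same behaviour.
first = Book("isbn-0001", "Example Title A", "Author A", 30.0)
second = Book("isbn-0002", "Example Title B", "Author B", 45.0)
first.update_price(27.5)

print(first.price, second.price)
```

Calling `update_price` on one object changes only that object's price, while the method itself is defined once for the whole class.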

    Relational database management is one of the most successful technologies in computer science. A lot of money is spent each year on relational database systems and applications, and much of the world's business data is stored in relational form. Until recently, most of the individual data items stored in relational databases were relatively small and simple. For storing these simple data items, database systems supported a set of predefined data types such as integers, real numbers, and character strings. The operations defined over these data types, such as arithmetic and comparison, were also simple and predefined.

    Increasingly, modern database applications need to store and manipulate objects that are neither small nor simple, and to perform operations on these objects that are not predefined. For example, the planning department of a city might need to store maps, photographs, written documents with diagrams, and audio and video recordings. A planner might then need to search this material for specific information. Multimedia applications are among the fastest-growing segments of the database industry, and because of the very large amounts of data that they require, we can expect the requirements of these applications to become increasingly important. The traditional data types and search capabilities of SQL are not sufficient for the new generation of multimedia database applications. But it is also clear that the requirements of these applications are so diverse that they cannot be satisfied by any set of predefined language extensions. What we need is not a collection of new data types and functions, but a facility that lets users define new data types and functions of their own.

    Another requirement of modern applications is databases that not only store data but also record and enforce business rules that apply to the data. Associating rules with data makes that data more active, enabling the database system to perform automatic validity checks and to automate many business procedures. Making data active increases its value because it enables applications to share not only the data itself but also the behavior of the data. Rules are stored in the database rather than being encoded in each application, so redundancy is eliminated and the integrity of the data is protected.

    Allowing users to define their own data types and functions, and allowing them to define rules that govern the behavior of active data, are both ways of increasing the value of stored data by increasing its semantic content. The trend toward increasing the semantic content of stored data is the most important trend in database management today. In order to accommodate and facilitate this trend, relational database systems are being enhanced in two ways:

    By adding an object infrastructure to the database system itself, in the form of support for user-defined data types, functions and rules.

    By building relational extenders on top of this infrastructure that support specialized applications such as image retrieval, advanced text searching and geographic applications.

    A system that includes both an object infrastructure and a set of relational extenders that exploit it is called an Object-Relational Database System. Object-relational systems combine the advantages of modern object-oriented programming languages with relational database features such as multiple views of data and a high-level, nonprocedural query language. An object-relational system is a good long-term investment because its extenders provide the capabilities you need to manage today's specialized objects, and its object infrastructure gives you the ability to define new types, functions, and rules to deal with the evolving needs of your business. Some of the object-relational systems available in the market are IBM's DB2 Universal Database, Oracle Corporation's Oracle 8, Microsoft Corporation's SQL Server 7, and so on.
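    The idea of user-defined functions and rules stored in the database can be illustrated on a small scale. The sketch below uses SQLite through Python's sqlite3 module purely as a stand-in; it is not one of the object-relational products named above, and the table map_sheet, the function area_sq_km and the trigger are hypothetical names invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def area_sq_km(width_km, height_km):
    """A user-defined function that SQL statements can call."""
    return width_km * height_km


# Register the Python function so it can be used inside SQL (2 arguments).
conn.create_function("area_sq_km", 2, area_sq_km)

conn.executescript("""
CREATE TABLE map_sheet (
    sheet_id  INTEGER PRIMARY KEY,
    width_km  REAL NOT NULL,
    height_km REAL NOT NULL
);

-- A rule kept in the database instead of in each application.
CREATE TRIGGER map_sheet_positive
BEFORE INSERT ON map_sheet
WHEN NEW.width_km <= 0 OR NEW.height_km <= 0
BEGIN
    SELECT RAISE(ABORT, 'map dimensions must be positive');
END;
""")

conn.execute("INSERT INTO map_sheet VALUES (1, 2.0, 3.5)")
areas = conn.execute(
    "SELECT sheet_id, area_sq_km(width_km, height_km) FROM map_sheet"
).fetchall()

# The rule rejects invalid data no matter which application sends it.
try:
    conn.execute("INSERT INTO map_sheet VALUES (2, -1.0, 3.0)")
    rule_enforced = False
except sqlite3.DatabaseError:
    rule_enforced = True

print(areas, rule_enforced)
```

    Because the rule lives in the database, every application that inserts into map_sheet is checked in the same way.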

    2.3 Schemas, Instances and States

    Database Schema is the description of a database. The schema is also called the intension. It is the overall design (logical structure) of the database, and it changes very infrequently. It includes descriptions of the database structure, data types, and the constraints on the database. Example: the database consists of information about a set of customers and accounts and the relationship between them.

    STUDENT
    Name    Student_number    Class    Major

    COURSE
    Course_name    Course_number    Credit_hours    Department

    PREREQUISITE
    Course_number    Prerequisite_number

    SECTION
    Section_identifier    Course_number    Semester    Year    Instructor

    GRADE_REPORT
    Student_number    Section_identifier    Grade

    Figure 2.3 Database Schema

    Schema Diagram is an illustrative display of (most aspects of) a database schema.

    Schema Construct is a component of the schema or an object within the schema, e.g., STUDENT, COURSE.

    Database State is the actual data stored in a database at a particular moment in time. It is also called the extension. The state includes the collection of all the data in the database. It is also called a database instance (or occurrence or snapshot). The term instance is also applied to individual database components, e.g. record instance, table instance, entity instance.

    COURSE
    Course_name                  Course_number    Credit_hours    Department
    Intro to Computer Science    CS1310           4               CS
    Data Structures              CS3320           4               CS
    Discrete Mathematics         MATH2410         3               MATH
    Database                     CS3380           3               CS

    SECTION
    Section_identifier    Course_number    Semester    Year    Instructor
    85                    MATH2410         Fall        2009    King
    92                    CS1310           Fall        2009    Anderson
    102                   CS3320           Spring      2010    Knuth
    112                   MATH2410         Fall        2010    Chang
    119                   CS1310           Fall        2010    Anderson
    135                   CS3380           Fall        2010    Stone

    GRADE_REPORT
    Student_number    Section_identifier    Grade
    17                112                   B
    17                119                   C
    8                 85                    A
    8                 92                    A
    8                 102                   B
    8                 135                   A

    PREREQUISITE
    Course_number    Prerequisite_number
    CS3380           CS3320
    CS3380           MATH2410
    CS3320           CS1310

    Figure 2.4 Database State

    Database State refers to the content of a database at a moment in time. Initial Database State refers to the database state when it is initially loaded into the system. Valid State is a state that satisfies the structure and constraints of the database. The database state changes every time the database is updated.
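    The difference between schema and state can be observed directly. The following is a minimal sketch using SQLite through Python's sqlite3 module (an assumption made for illustration; the text does not name a particular DBMS). The helper names schema and state are hypothetical, and the rows are taken from the COURSE table of Figure 2.4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema (intension): the description of the COURSE table.
conn.execute("""
    CREATE TABLE course (
        course_name   TEXT,
        course_number TEXT PRIMARY KEY,
        credit_hours  INTEGER,
        department    TEXT
    )
""")


def schema(connection):
    """Return the stored table definitions (SQLite's catalog)."""
    query = "SELECT sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [row[0] for row in connection.execute(query)]


def state(connection):
    """Return the current content of the COURSE table."""
    query = "SELECT course_number, course_name FROM course ORDER BY course_number"
    return connection.execute(query).fetchall()


schema_before = schema(conn)
empty_state = state(conn)  # no data has been loaded yet

# Each update produces a new database state (extension).
conn.execute("INSERT INTO course VALUES ('Data Structures', 'CS3320', 4, 'CS')")
conn.execute("INSERT INTO course VALUES ('Database', 'CS3380', 3, 'CS')")

loaded_state = state(conn)
schema_after = schema(conn)

print(schema_before == schema_after)  # the schema did not change
print(empty_state, loaded_state)      # the state did
```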

    2.4 Three-Schema Architecture

    The Three-Schema Architecture was proposed to support the DBMS characteristics of program-data independence and multiple views of the data. It was not explicitly used in commercial DBMS products, but has been useful in explaining database system organization. It defines DBMS schemas at three levels:

    Internal schema at the internal level to describe physical storage structures and access paths (e.g. indexes). It typically uses a physical data model.

    Conceptual schema at the conceptual level to describe the structure and constraints for the whole database for a community of users. It uses a conceptual or an implementation data model.

    External schemas at the external level to describe the various user views. They usually use the same data model as the conceptual schema.

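    Figure 2.5 The Three-Schema Architecture (External Views at the External Level, External/Conceptual Mapping, Conceptual Level, Conceptual/Internal Mapping, Internal Level, Stored Database)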

    Mappings among schema levels are needed to transform requests and data. Programs refer to an external schema, and their requests are mapped by the DBMS to the internal schema for execution. Data extracted from the internal DBMS level is reformatted to match the user's external view (e.g. formatting the results of an SQL query for display in a web page).
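    As a rough illustration of the three levels, the sketch below treats a table definition as the conceptual schema, an index as part of the internal schema, and a view as an external schema. It uses SQLite through Python's sqlite3 module as an assumed stand-in; the index name idx_section_course and the view name fall_2010_teaching are hypothetical, and the rows come from the SECTION table of Figure 2.4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: structure for the whole community of users.
CREATE TABLE section (
    section_identifier INTEGER PRIMARY KEY,
    course_number      TEXT,
    semester           TEXT,
    year               INTEGER,
    instructor         TEXT
);

-- Internal level: an access path chosen for performance.
CREATE INDEX idx_section_course ON section(course_number);

-- External level: one user view over the conceptual schema.
CREATE VIEW fall_2010_teaching AS
    SELECT instructor, course_number
    FROM section
    WHERE semester = 'Fall' AND year = 2010;
""")

conn.executemany(
    "INSERT INTO section VALUES (?, ?, ?, ?, ?)",
    [
        (102, "CS3320", "Spring", 2010, "Knuth"),
        (112, "MATH2410", "Fall", 2010, "Chang"),
        (135, "CS3380", "Fall", 2010, "Stone"),
    ],
)

# The program refers only to the external schema (the view); the DBMS maps
# the request down to the stored table and its access paths.
rows = conn.execute(
    "SELECT instructor, course_number FROM fall_2010_teaching ORDER BY instructor"
).fetchall()

# Reformatting the extracted data for presentation to the user.
lines = [f"{instructor} teaches {course}" for instructor, course in rows]
print(lines)
```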

    2.5 Data Independence

    The ability to modify a schema definition at one level without affecting a schema definition at a higher level is called data independence. There are two levels of data independence.

    Physical Data Independence is the ability to modify the physical schema without causing application programs to be rewritten. Modifications at this level are typically made to improve performance.

    Logical Data Independence is the ability to modify the conceptual (logical) schema without causing application programs to be rewritten or requiring any change to the external schemas. Such modifications are usually made when the logical structure of the database is altered, for example by adding or removing entities, attributes or relationships.
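    Both kinds of data independence can be demonstrated with a small experiment, sketched here with SQLite through Python's sqlite3 module (an assumed stand-in for a full DBMS). The view student_grades, the index idx_grade_student and the added column graded_on are hypothetical names; the rows come from GRADE_REPORT in Figure 2.4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grade_report (
    student_number     INTEGER,
    section_identifier INTEGER,
    grade              TEXT
);
CREATE VIEW student_grades AS
    SELECT student_number, grade FROM grade_report;
INSERT INTO grade_report VALUES (17, 112, 'B'), (17, 119, 'C'), (8, 85, 'A');
""")


def application_query(connection):
    """The application's query, written once against the external view."""
    return connection.execute(
        "SELECT student_number, grade FROM student_grades "
        "WHERE student_number = 17 ORDER BY grade"
    ).fetchall()


baseline = application_query(conn)

# Physical change: add an access path. The application is not rewritten.
conn.execute("CREATE INDEX idx_grade_student ON grade_report(student_number)")
after_physical_change = application_query(conn)

# Logical change: add an attribute to the conceptual schema. The view, and
# therefore the application, is unaffected.
conn.execute("ALTER TABLE grade_report ADD COLUMN graded_on TEXT")
after_logical_change = application_query(conn)

print(baseline == after_physical_change == after_logical_change)
```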

    2.6 DBMS Languages and Interfaces

    Data Definition Language (DDL) defines the logical schema (relations, views, etc.) and the storage schema, which are stored in a data dictionary. The DDL is used by the DBA and database designers to specify the conceptual schema of a database. In many DBMSs, the DDL is also used to define internal and external schemas (views). In some DBMSs, a separate storage definition language (SDL) and view definition language (VDL) are used to define internal and external schemas. SDL is typically realized via DBMS commands provided to the DBA and database designers.

    Data Manipulation Language (DML) is a family of computer languages used by computer programs and/or database users to insert, delete and update data in a database. Read-only querying of this data, i.e. SELECT, may be considered either part of DML or outside it, depending on the context.

    High-level or non-procedural languages, such as the relational language SQL, are set-oriented and specify what data to retrieve rather than how to retrieve it; they are also called declarative languages. They may be used in a standalone way or embedded in a programming language. Low-level or procedural languages must be embedded in a programming language. They retrieve data one record at a time, so they need constructs such as loops to retrieve multiple records, along with positioning pointers.
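    The distinctions above can be shown side by side. In this sketch, SQL is embedded in Python through the sqlite3 module (SQLite is an assumption chosen for illustration). It issues one DDL statement, a few DML statements, and then answers the same question twice: once declaratively, and once in a record-at-a-time style with an explicit loop. The change of CS3380 to 4 credit hours is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema.
conn.execute("""
    CREATE TABLE course (
        course_number TEXT PRIMARY KEY,
        course_name   TEXT,
        credit_hours  INTEGER,
        department    TEXT
    )
""")

# DML: insert, update and delete data.
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?, ?)",
    [
        ("CS1310", "Intro to Computer Science", 4, "CS"),
        ("CS3320", "Data Structures", 4, "CS"),
        ("MATH2410", "Discrete Mathematics", 3, "MATH"),
        ("CS3380", "Database", 3, "CS"),
    ],
)
conn.execute("UPDATE course SET credit_hours = 4 WHERE course_number = 'CS3380'")
conn.execute("DELETE FROM course WHERE department = 'MATH'")

# Declarative, set-oriented: state WHAT is wanted; the DBMS decides how.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT course_number FROM course "
        "WHERE credit_hours = 4 ORDER BY course_number"
    )
]

# Record-at-a-time: the program positions a cursor, fetches each record
# in a loop and applies the test itself.
procedural = []
cursor = conn.execute("SELECT course_number, credit_hours FROM course")
while True:
    record = cursor.fetchone()
    if record is None:
        break
    if record[1] == 4:
        procedural.append(record[0])
procedural.sort()

print(declarative == procedural)
```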

    2.7 Database System Utilities and Tools

    Database System Utilities perform certain functions such as:

    Loading data stored in files into a database. Includes data conversion tools.

    Backing up the database periodically on tape.

    Reorganizing database file structures.

    Report generation utilities.

    Other functions such as sorting, user monitoring, data compression, etc.

    Other tools

    Data dictionary/repository is used to store schema descriptions and other information such as design decisions, application program descriptions, user information, usage standards, etc.

    Active data dictionary is accessed by DBMS software and users/DBA.

    Passive data dictionary is accessed by users/DBA only.

    Application Development Environments and CASE (Computer Aided Software Engineering) tools such as PowerBuilder (Sybase), JBuilder (Borland), JDeveloper 10G (Oracle)
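    A few of these utility functions can be imitated on a small scale. The sketch below is an assumption-laden illustration using Python's csv and sqlite3 modules: it loads data from a file (an in-memory stand-in for a CSV file), makes a backup copy with the Connection.backup method of sqlite3, and reads SQLite's catalog table sqlite_master as a simple data dictionary. A production DBMS ships dedicated tools for each of these tasks.

```python
import csv
import io
import sqlite3

# Stand-in for a data file on disk; the content is illustrative.
csv_file = io.StringIO(
    "course_number,course_name\n"
    "CS3320,Data Structures\n"
    "CS3380,Database\n"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_number TEXT, course_name TEXT)")

# Loading utility: convert file records into rows and insert them.
reader = csv.DictReader(csv_file)
rows = [(record["course_number"], record["course_name"]) for record in reader]
conn.executemany("INSERT INTO course VALUES (?, ?)", rows)
conn.commit()

# Backup utility: copy the whole database into a second database.
backup_conn = sqlite3.connect(":memory:")
conn.backup(backup_conn)
backed_up_rows = backup_conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]

# Data dictionary: the catalog records which schema objects exist.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

print(backed_up_rows, tables)
```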

    2.8 Centralized and Client-Server Architectures

    A centralized DBMS combines everything into a single system, including the DBMS software, hardware, application programs, and user interface processing software. Users can still connect through a remote terminal; however, all processing is done at the centralized site.

    Two-tier architecture

    The 2-tier model is simpler, but more limited, than a 3-tier model, and often includes:

    The database management system (DBMS)

    The main application software, including the GUI

    Here, the entire application is generally run on the client machine. In some contexts, the 2-tier model is also known as the client-server model, where the server can be something other than a database.

    E.g. client programs using ODBC/JDBC to communicate with a database

    Advantages of Two-Tier Approach

    Clients do not have to be as powerful

    Greatly reduces data traffic on the network

    Improved data integrity since it is all processed centrally

    Limitations

    o Performance deteriorates if the number of users is greater than 100

    o Restricted flexibility and choice of DBMS, since the data language used in the server is proprietary to each vendor

    o Limited functionality in moving program functionality across servers

    Three-tier architecture

    A three-tier architecture is one which has a client tier, a middle tier, and a database tier.

    The database tier manages the database

    The middle tier contains most of the logic and communicates between the other tiers

    The client tier is the interface between the user and the system

    E.g. web-based applications and applications built using middleware
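    The division of responsibilities among the three tiers can be sketched in a single program. This is only a toy model: in a real deployment each tier would typically run as a separate process or on a separate machine, and the names DatabaseTier, MiddleTier and client_tier, together with the banking rule and sample account, are hypothetical.

```python
import sqlite3


class DatabaseTier:
    """Manages the database and nothing else."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE account (owner TEXT PRIMARY KEY, balance REAL)"
        )
        self._conn.execute("INSERT INTO account VALUES ('alice', 100.0)")

    def get_balance(self, owner):
        row = self._conn.execute(
            "SELECT balance FROM account WHERE owner = ?", (owner,)
        ).fetchone()
        return None if row is None else row[0]

    def set_balance(self, owner, balance):
        self._conn.execute(
            "UPDATE account SET balance = ? WHERE owner = ?", (balance, owner)
        )


class MiddleTier:
    """Holds the application logic and talks to both other tiers."""

    def __init__(self, database):
        self._database = database

    def withdraw(self, owner, amount):
        balance = self._database.get_balance(owner)
        if balance is None:
            raise LookupError("unknown account")
        if amount <= 0 or amount > balance:
            raise ValueError("invalid amount or insufficient funds")
        new_balance = balance - amount
        self._database.set_balance(owner, new_balance)
        return new_balance


def client_tier(middle, owner, amount):
    """User interface: passes requests on and formats the replies."""
    try:
        return f"New balance: {middle.withdraw(owner, amount):.2f}"
    except (LookupError, ValueError) as error:
        return f"Request rejected: {error}"


middle = MiddleTier(DatabaseTier())
accepted = client_tier(middle, "alice", 30)
rejected = client_tier(middle, "alice", 500)
print(accepted)
print(rejected)
```

    The client never touches SQL and the database tier never decides whether a withdrawal is allowed, which is what makes it possible to change one tier without rewriting the others.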

    Advantages of Three-Tier Architectures

    Scalability

    Technological flexibility

    Long-term cost reduction

    Better match of systems to business needs

    Improved customer service

    Reduced risk

    Challenges of Three-tier Architectures

    o High short-term costs

    o Tools and training

    o Experience

    o Incompatible standards

    o Lack of compatible end-user tools

    2.9 Classification of DBMSs

    DBMSs can be classified based on the data model used, such as:

    Traditional: Relational, Network, Hierarchical.

    Emerging: Object-oriented, Object-relational.

    DBMSs can also be classified by the number of users: single-user (typically used with personal computers) versus multi-user (most DBMSs).
