principles and practice of database systems - springer978-1-349-17958-9/1.pdf · 1.2 institutions...

13
Principles and Practice of Database Systems

Upload: truongthien

Post on 31-Jan-2018

229 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Principles and Practice of Database Systems - Springer978-1-349-17958-9/1.pdf · 1.2 Institutions involved with database development 1.3 Concepts and definitions ... 9.4 Cobol subschema

Principles and Practice of Database Systems

Page 2: Principles and Practice of Database Systems - Springer978-1-349-17958-9/1.pdf · 1.2 Institutions involved with database development 1.3 Concepts and definitions ... 9.4 Cobol subschema

Macmillan Computer Science Series

Consulting EditorProfessor F. H. Sumner, University of Manchester

S. T. Allworth, Introduction to Real-time Software DesignIan O. Angell, A Practical Introduction to Computer GraphicsR. E. Berry and B. A. E. Meekings, A Book on CG. M. Birtwistle, Discrete Event Modelling on SimulaT. B. Boffey, Graph Theory in Operations ResearchRichard Bornat, Understanding and Writing CompilersJ. K. Buckle, The ICL 2900 SeriesJ. K. Buckle, Software Configuration ManagementJ. C. Cluley, Interfacing to MicroprocessorsRobert Cole, Computer CommunicationsDerek Coleman, A Structured Programming Approach to DataAndrew J. T. Colin, Fundamentals ofComputer ScienceAndrew J. T. Colin, Programming and Problem-solving in Algol 68S. M. Deen, Principles and Practice ofDatabase SystemsP. M. Dew and K. R. James, Introduction to Numerical Computation in PascalM. R. M. Dunsmuir and G. J. Davies, Programming the UNIX SystemK. C. E. Gee, Introduction to Local Area Computer NetworksJ. B. Gosling, Design ofArithmetic Units for Digital ComputersRoger Hutty, Fortran for StudentsRoger Hutty, Z80 Assembly Language Programming for StudentsRoland N. Ibbett, The Architecture ofHigh Performance ComputersP. Jaulent, The 68000 - Hardware and SoftwareM. J. King and J. P. Pardoe, Program Design UsingJSP - A Practical IntroductionH. Kopetz, Software Reliability ,E. V. Krishnamurthy, Introductory Theory ofComputer ScienceGraham Lee, From Hardware to Software: an introduction to computersA. M. Lister, Fundamentals ofOperating Systems, third editionG. P. McKeown and V. J. Rayward-Smith,Mathematicsfor ComputingBrian Meek, Fortran, PL/1 and the AlgolsDerrick Morris, An Introduction to System Programming - Based on the PDP11Derrick Morris and Roland N. Ibbett, The MU5 Computer SystemC. Queinnec, LISPJohn Race, Case Studies in Systems AnalysisW. P Salman, O. Tisserand and B. Toulout, FORTHL. E. Scales, Introduction to Non-Linear OptimizationP. S. Sell, Expert Systems - A Practical IntroductionColin J. Theaker and Graham R. Brookes, A Practical Course on Operating SystemsM. J. Usher, Information Theory for Information TechnologistsB. S. Walker, Understanding MicroprocessorsPeter J. L. Wallis,Portable ProgrammingI. R. Wilson and A. M. Addyman, A Practical Introduction to Pascal - with BS

6192, second edition

Page 3: Principles and Practice of Database Systems - Springer978-1-349-17958-9/1.pdf · 1.2 Institutions involved with database development 1.3 Concepts and definitions ... 9.4 Cobol subschema

Principles andPractice of Database

Systems

S.M. Deen

Department of Computing ScienceUniversity of Aberdeen

MMACMILLAN

Page 4: Principles and Practice of Database Systems - Springer978-1-349-17958-9/1.pdf · 1.2 Institutions involved with database development 1.3 Concepts and definitions ... 9.4 Cobol subschema

© S. M. Deen 1985

All rights reserved. No reproduction, copy or transmissionof this publication may be made without written permission.

No paragraph of this publication may be reproduced, copiedor transmitted save with written permission or in accordancewith the provisions of the Copyright Act 1956 (as amended).

Any person who does any unauthorised act in relation tothis publication may be liable to criminal prosecution andcivil claims for damages.

First published 1985

Published byHigher and Further Education DivisionMACMILLAN PUBLISHERS LTDHoundmills, Basingstoke, Hampshire RG21 2XSand LondonCompanies and representativesthroughout the world

Printed in Great Britain byThe Camelot Press Ltd,Southampton

British Library Cataloguing in Publication Data

Deen, S. M.Principles and practice of database systems.­(Macmillan computer science series)1. Data base management 2. File organization(Computer science)I. Title001.64'42 QA76.9.D3

ISBN 978-0-333-37100-8 ISBN 978-1-349-17958-9 (eBook)

DOI 10.1007/978-1-349-17958-9

Page 5: Principles and Practice of Database Systems - Springer978-1-349-17958-9/1.pdf · 1.2 Institutions involved with database development 1.3 Concepts and definitions ... 9.4 Cobol subschema

Contents

[Advanced chapters and sections are marked with an asterisk *]

Preface

Acknowledgements

A Note to Tutors and Readers

PART 1: FOUNDATION

1 Introduction1.1 History of the evolution of databases1.2 Institutions involved with database development1.3 Concepts and definitions1.4 Facilities and limitationsReferences

2 Data and File Structure2.1 Basic concepts2.2 Indexed files2.3 Random files2.4 Inverted files2.5 Linked structuresExercisesReferences

*3 Pointers and Advanced Indexes3.1 Pointer types3.2 Pointer organisation3.3 Advanced indexing techniquesExercisesReferences

PART II: DESIGN ISSUES

4 Architectural Issues4.1 Three basic levels

v

ix

xi

xii

3367

1112

1313161924273737

383840455454

55

5757

Page 6: Principles and Practice of Database Systems - Springer978-1-349-17958-9/1.pdf · 1.2 Institutions involved with database development 1.3 Concepts and definitions ... 9.4 Cobol subschema

vi Contents

4.2 Entity characteristics 594.3 Data representation 614.4 Data placement and access 704.5 Database architecture 774.6 Data independence 84Exercises 87References 88

5 Supporting Facilities5.1 Concurrent usage5.2 Data protection5.3 Modification and optimisation5.4 Data dictionary5.5 Languages and utilities5.6 Database control system*5.7 Database recovery5.8 ConclusionExercisesReferences

PART III: RELATIONAL AND NETWORK APPROACHES

6 Basic Relational Model6.1 Basic concepts6.2 Normalisation6.3 Data manipulation6.4 Implementation issuesExercisesReferences

*7 Higher Normal Fonns7.1 Fourth normal form (4NF)7.2 An ultimate normal formExercisesReferences

*8 Relational Languages8.1 Structural Query Language (SQL)8.2 Query-By-Example (QBE)8.3 Tuple versus domain calculus8.4 Implementation of relational operationsExercisesReferences

8989

101108112114116122129130130

133

135135139151169172174

176176184188189

190190202209211217221

Page 7: Principles and Practice of Database Systems - Springer978-1-349-17958-9/1.pdf · 1.2 Institutions involved with database development 1.3 Concepts and definitions ... 9.4 Cobol subschema

Contents

9 Codasyl Model9.1 Codasyl9.2 Basic concepts9.3 Codasyl schema model9.4 Cobol subschema model9.5 Cobol data manipulation facilities9.6 Other characteristicsExercisesReferences

*10 Codasyl Model Revisited10.1 Storage schema10.2 Fortran database facility10.3 Past changes10.4 Evaluation of the Codasyl modelExercisesReferences

*11 Standards and Equivalence11.1 ANSI network model11.2 ANSI relational model11.3 Equivalence of the two modelsExercisesReferences

PART IV: THE STATE OF THE ART

12 User Issues in Implementation12.1 Database environment12.2 Database selectionExercisesReferences

*13 Prototypes and Earlier Models13.1 SYSTEM R13.2 Hierarchical model (IMS)13.3 Inverted and net structured models13.4 Summary and conclusionExercisesReferences

vii

222222224227241246259260262

263263267268272276276

277277289290296296

299

301301306309309

310310315325331332333

Page 8: Principles and Practice of Database Systems - Springer978-1-349-17958-9/1.pdf · 1.2 Institutions involved with database development 1.3 Concepts and definitions ... 9.4 Cobol subschema

viii Contents

PART V: FUTURE DEVELOPMENTS

*14 New Approaches14.1 Data modelling14.2 End-user facilities14.3 Database machines14.4 New applicationsExercisesReferences

*15 Distributed Databases15.1 Multi-level control15.2 Nodal variations15.3 User facilities15.4 Privacy, integrity, reliability15.5 Architectural issuesExercisesReferences

Hints and Answers to Selected Exercises

Index

335

337337344347350354354

357358361363365367373373

375

388

Page 9: Principles and Practice of Database Systems - Springer978-1-349-17958-9/1.pdf · 1.2 Institutions involved with database development 1.3 Concepts and definitions ... 9.4 Cobol subschema

Preface

This book was originally intended as a new edition of my earlier book FundamentalsofDataBaseSystems, published several years ago. But, in the intervening period,so much has changed in the field of databases that I thought it better to write anentirely new book - with a new framework to incorporate the change. However,like its predecessor, this book also draws heavily from my undergraduate course atAberdeen, which has now been given for over ten years. A major source of contribu­tion to my course, and more importantly to the contents of this book, has been thedeliberations of the British Database Teachers Conferences, which I helped toorganise in the late seventies and early eighties, with a view to improving the stan­dard of database teaching in the U.K. I have also taken into consideration thecomments and suggestions received from the readers of my earlier book. An areato which I have paid special attention is the need of potential research students foran advanced treatment of some database issues at the undergraduate level, as partof a general background. This is an open-ended area with considerable scope fordisagreements as to what should or should not be included. In this I have been in­fluenced by my involvements in a number of international conferences on databaseresearch, and by my experience as a supervisor of research students. Thus the bookis intended as a general text on databases for undergraduate students and otherin terested readers, but has been extended to include some advanced topics forthose who need them.

The book deals primarily with the general principles of databases, using therelational and network models as the main vehicle for the realisation of theseprinciples. Its basic objective is to impart a deeper understanding of the issues andproblems, rather than to produce database programmers. To reinforce what islearnt, exercises with selected answers are provided at the end of each chapter(except for chapter 1). The references cited can be looked up for further studies.

The book is divided into five parts, with Part I (chapters 1-3) presenting a pre­requisite for the appreciation of the main database issues. Part II (chapters 4 and 5)covers the issues and problems, and thus constitutes the kernel of the book. Part IIIis devoted to the relational and network models and is spread over six chapters(6-11). A rationale of this part could be helpful.

I have presented the first three normal forms and the original relational languages(algebra, calculus, Alpha) as part of a basic (or classical) relational model in chapter6. Needless to say, a sound knowledge of a relational algebra is essential for aproper understanding of the relational model, and Codd's calculus and Alpha are

ix

Page 10: Principles and Practice of Database Systems - Springer978-1-349-17958-9/1.pdf · 1.2 Institutions involved with database development 1.3 Concepts and definitions ... 9.4 Cobol subschema

x Preface

the principal sources of all subsequent calculus based languages. QUEL is includedin chapter 6 as the closest implementation of Alpha. Admittedly SQL is the leadingrelational language, but I found it more convenient to present it in chapter 8 withQBE, rather than packing it in an already large chapter 6. Chapter 7 on highernormal forms stands on its own.

The Codasyl model is described in chapter 9 in fair detail. In keeping with theobjective of this book, I have stressed the semantics more than the syntaxes ­syntaxes are given mainly for a more concrete understanding of the concepts in­volved. An overview of the storage schema in chapter 10 outlines a possible storagemodel for the Codasyl schema, and also shows where the storage dependent clausesof the earlier schema models have been moved to. The Fortran subschema facilityin chapter 10 illustrates the mechanism of supporting a different host language onthe Codasyl model. Since the implemented Codasyl packages are based on theearlier models, the section on the past changes provides a link between the latestversion and the one that the user may still have in his computer centre.

Obviously the ANSI network model presented in chapter 11 will eventuallyreplace the Codasyl model. Additionally, it offers some new ideas on database/hostlanguage interfaces. The same chapter also carries a comparison between the rela­tional and the network models, which attempts to fulfil a need recognised by manyteachers.

Part IV (chapters 12 and 13) describes the current state of the art. Chapter 12reviews user issues, while chapter 13 discusses some special products, includingSYSTEMR which demonstrates many of the pioneering implementation techniques ofthe relational model. The other special products described there represent the earlierdata models and hence their coverage highlights their modelling powers, rather thantheir supporting facilities. Since the hierarchical model (IMS) is regarded by manyexperts as the third major data model (after the relational and network models), Ihave described it more fully and also compared its modelling capacity with that ofthe network model.

The final Part V (chapters 14 and 15) attempts to provide a glimpse of the future,and includes such topics as I thought useful and exciting.

Department ofComputing ScienceUniversity ofAberdeen

S. M. Deen

Page 11: Principles and Practice of Database Systems - Springer978-1-349-17958-9/1.pdf · 1.2 Institutions involved with database development 1.3 Concepts and definitions ... 9.4 Cobol subschema

Acknowledgements

I wish to thank a number of people who assisted me during the writing of this book.To Ray Carrick, John Edgar, Don Kennedy and Malcolm Taylor ofmy PRECI researchgroup at Aberdeen, for going through the manuscripts and for suggesting correc­tions. To David Bell (University of Ulster), Bill Samson (Dundee College ofTechnology), Stan Blethyn (Bristol Polytechnic) and the publisher's referee forreviewing the material and for their valuable comments. Stan Blethyn also suppliedexamination questions, some of which are included in the exercises. To FrankManola (Computer Corporation of America, U.S.A.) for keeping me informed onthe latest developments on the ANSI data models.

I am also thankful to Muriel Jones of IBM, Phillip Robling of Cincom andGeoff Stacey of Edinburgh University for reviewing the coverages on IMS, TOTALand Codasyl DSDL respectively. My special thanks are to my secretary, SheilaTomlins, for her painstaking typing and retypings of the manuscript. She was atremendous help. .

Finally, I take this opportunity also to express my indebtedness to my familyfor the support I received. To my wife Rudaina for her patience and encourage­ment and to two young men, Rami (12) and Alvin (10), for leaving their dad aloneeven when their computer programs needed debugging.

xi

Page 12: Principles and Practice of Database Systems - Springer978-1-349-17958-9/1.pdf · 1.2 Institutions involved with database development 1.3 Concepts and definitions ... 9.4 Cobol subschema

A Note to Tutors and Readers

As pointed out in the Preface, this book is aimed at undergraduate students andother interested readers such as programmers, systems designers and databaseadministrators. The reader is assumed to possess some knowledge of computersand of Cobol as a general background, within which each topic in the book isdeveloped - starting from a basic level and gradually progressing towards deeperissues. The contents of the book can be logically divided into two parts, basic andadvanced. The chapters covering the basic material are

chapter 1chapter 2chapter 3chapter 4chapter 5 (except section 5.7 which is advanced)chapter 6chapter 9chapter 12

(advanced chapters are marked with an asterisk * in the Contents list).

It should be possible to understand the basic material, by-passing the advancedparts, without any serious loss of continuity. Within each chapter there are issueswhich are treated in some depth, but a reader not interested in these issues shouldbe able to skip over them, again without any significant loss of understanding ofthe rest of the material. Every basic chapter is largely self-contained; most ofchapters 6 and 9 (the relational and the Codasyl models) can be understood at apreliminary level if - apart from chapters 1 and 2 - only sections 4.1, 4.2 and 4.3are read beforehand. However, a more thorough understanding of relevant databaseissues requires the study of both chapters 4 and 5 in full.

For an introductory course on the relational model it should not be necessaryto cover functional dependencies and determinants, since both the second andthird normal forms have been given alternative deflnitions. Likewise, quadraticjoin, outer join and QUEL can also be skipped if so desired. The students can beasked to read relational calculus and DSL Alpha for themselves.

If a briefer coverage is intended for the Codasyl model, then exclude the CALL,CHECK and SOURCE/RESULT clauses, the STRUCTURAL constraint clause and

xii

Page 13: Principles and Practice of Database Systems - Springer978-1-349-17958-9/1.pdf · 1.2 Institutions involved with database development 1.3 Concepts and definitions ... 9.4 Cobol subschema

A Note to Tutors xiii

the multimember set types. Also restrict the set order criteria to the DEFAULToption and set selection criteria to the current of set types. Correspondingly, theAlternate clause, the set selection criteria and the subschema working storagesection can be dropped from the subschema. Chapter 12 is written mainly forbeginners, to give them some notion of the database environments. It is suitablefor self-reading by the students.

For advanced studies, chapter 3 should be covered along with other chapters asneeded. Much of chapter 10, and the ANSI network and relational models inchapter 11, can be given to the advanced students to read for themselves, with aview to carrying out some of the exercises at the end of those chapters.

Many of the exercises in the book are drawn from Honours degree examinationquestions. Readers' attention is drawn particularly to exercise 6.9 of chapter 6,which includes a cyclic query, exercise 8.8 of chapter 8, which shows a limitationof the relational model, and exercise 9.12 of chapter 9, which illustrates the use ofcurrency indicators. There are also three special case studies presented in exercises8.7,8.8 and 8.9 of chapter 8 for relational solutions, and in 9.9, 9.10 and 9.11 ofchapter 9 for Codasyl solutions. These case studies help to bring out the compara­tive features of the two models.

Readers are also asked to have a look at the Preface, which explains the generallayout of the book.