© 2009 Pearson Education, Inc. Publishing as Prentice Hall
Chapter 13: Data and Database Administration
Modern Database Management, 9th Edition
Jeffrey A. Hoffer, Mary B. Prescott, Heikki Topi
Objectives
• Definition of terms
• List functions and roles of data/database administration
• Describe role of data dictionaries and information repositories
• Compare optimistic and pessimistic concurrency control
• Describe problems and techniques for data security
• Describe problems and techniques for data recovery
• Describe database tuning issues and list areas where changes can be done to tune the database
• Describe importance and measures of data availability
Traditional Administration Definitions
• Data Administration: A high-level function that is responsible for the overall management of data resources in an organization, including maintaining corporate-wide definitions and standards
• Database Administration: A technical function that is responsible for physical database design and for dealing with technical issues such as security enforcement, database performance, and backup and recovery
Traditional Data Administration Functions
• Data policies, procedures, standards
• Planning
• Data conflict (ownership) resolution
• Managing the information repository
• Internal marketing of DA concepts
Traditional Database Administration Functions
• Selection of DBMS and software tools
• Installing/upgrading DBMS
• Tuning database performance
• Improving query processing performance
• Managing data security, privacy, and integrity
• Data backup and recovery
Evolving Approaches to Data Administration
• Blend data and database administration into one role
• Fast-track development: monitoring development process (planning, analysis, design, implementation, maintenance)
• Procedural DBAs: managing quality of triggers and stored procedures
• eDBA: managing Internet-enabled database applications
• PDA DBA: data synchronization and personal database management
• Data warehouse administration
Data Warehouse Administration
• New role, coming with the growth in data warehouses
• Similar to DA/DBA roles
• Emphasis on integration and coordination of metadata/data across many data sources
• Specific roles:
  – Support DSS applications
  – Manage data warehouse growth
  – Establish service level agreements regarding data warehouses and data marts
Open Source DBMSs
• An alternative to proprietary packages such as Oracle, Microsoft SQL Server, or Microsoft Access
• MySQL is an example of an open-source DBMS
• Less expensive than proprietary packages
• Source code available, for modification
• Absence of complete documentation
• Ambiguous licensing concerns
• Not as feature-rich as proprietary DBMSs
• Vendors may not have certification programs
Figure 13-2 Data modeling responsibilities
Database Security
• Database Security: Protection of the data against accidental or intentional loss, destruction, or misuse
• Increased difficulty due to Internet access and client/server technologies
Figure 13-3 Possible locations of data security threats
Threats to Data Security
• Accidental losses attributable to:
  – Human error
  – Software failure
  – Hardware failure
• Theft and fraud
• Improper data access:
  – Loss of privacy (personal data)
  – Loss of confidentiality (corporate data)
• Loss of data integrity
• Loss of availability (through, e.g., sabotage)
Figure 13-4 Establishing Internet Security
Web Security
• Static HTML files are easy to secure
  – Standard database access controls
  – Place Web files in protected directories on server
• Dynamic pages are harder
  – Control of CGI scripts
  – User authentication
  – Session security
  – SSL for encryption
  – Restrict number of users and open ports
  – Remove unnecessary programs
W3C Web Privacy Standard
• Platform for Privacy Preferences (P3P)
• Addresses the following:
  – Who collects data
  – What data is collected and for what purpose
  – Who is data shared with
  – Can users control access to their data
  – How are disputes resolved
  – Policies for retaining data
  – Where are policies kept and how can they be accessed
Database Software Security Features
• Views or subschemas
• Integrity controls
• Authorization rules
• User-defined procedures
• Encryption
• Authentication schemes
• Backup, journalizing, and checkpointing
Views and Integrity Controls
• Views
  – Subset of the database that is presented to one or more users
  – User can be given access privilege to view without allowing access privilege to underlying tables
• Integrity Controls
  – Protect data from unauthorized use
  – Domains: set allowable values
  – Assertions: enforce database conditions
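As a minimal sketch of these two features, the Python snippet below uses the standard-library sqlite3 module. The employee table, its columns, and the employee_public view are hypothetical names chosen for illustration. CHECK constraints stand in for domain definitions here, and because SQLite has no GRANT statement, the privilege side of views is only noted in a comment.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a constrained table and a view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (
            emp_id INTEGER PRIMARY KEY,
            name   TEXT NOT NULL,
            -- Domain-style control: only these values are allowed.
            dept   TEXT NOT NULL CHECK (dept IN ('SALES', 'OPS', 'HR')),
            salary REAL NOT NULL CHECK (salary >= 0)
        );
        -- The view exposes a subset of columns and hides salary.
        -- In a DBMS with GRANT, users could be given access to the view
        -- without any privilege on the underlying table.
        CREATE VIEW employee_public AS
            SELECT emp_id, name, dept FROM employee;
    """)
    conn.execute("INSERT INTO employee VALUES (1, 'Ana', 'SALES', 52000)")
    conn.commit()
    return conn


def view_columns(conn):
    """Return the column names visible through the view."""
    cur = conn.execute("SELECT * FROM employee_public")
    return [col[0] for col in cur.description]


def try_bad_insert(conn):
    """Attempt to store a value outside the allowed domain."""
    try:
        conn.execute("INSERT INTO employee VALUES (2, 'Bo', 'LEGAL', 40000)")
    except sqlite3.IntegrityError:
        return False  # the integrity control rejected the row
    return True
```

Calling view_columns shows that salary never appears through the view, and try_bad_insert returns False because 'LEGAL' is not in the allowed set.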
Authorization Rules
• Controls incorporated in the data management system
• Restrict:
  – access to data
  – actions that people can take on data
• Authorization matrix for:
  – Subjects
  – Objects
  – Actions
  – Constraints
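One way to picture an authorization matrix is as a list of (subject, object, action, constraint) rows that is consulted on every request. The sketch below is an assumption-laden illustration, not the mechanism of any particular DBMS: the Rule class, the sample subjects and objects, and the credit-limit constraint are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    subject: str
    obj: str
    action: str
    # Optional predicate over the request context; None means "no constraint".
    constraint: Optional[Callable[[dict], bool]] = None


# Hypothetical matrix rows, one per permitted (subject, object, action).
RULES = [
    Rule("Sales Dept", "Customer record", "insert",
         lambda ctx: ctx.get("credit_limit", 0) <= 5000),
    Rule("Sales Dept", "Customer record", "read"),
    Rule("Accounting", "Order record", "read"),
]


def is_authorized(subject, obj, action, context=None):
    """Return True if some rule permits the request and its constraint holds."""
    context = context or {}
    for rule in RULES:
        if (rule.subject, rule.obj, rule.action) == (subject, obj, action):
            if rule.constraint is None or rule.constraint(context):
                return True
    return False  # default deny: anything not listed is refused
```

The default-deny return is a design choice of this sketch: a request is refused unless a row explicitly allows it.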
Figure 13-5 Authorization matrix
Implementing authorization rules
Figure 13-6a Authorization table for subjects (salespeople)
Figure 13-6b Authorization table for objects (orders)
Figure 13-7 Oracle privileges
Some DBMSs also provide capabilities for user-defined procedures to customize the authorization process
Encryption – the coding or scrambling of data so that humans cannot read them
Secure Sockets Layer (SSL) is a popular encryption scheme for TCP/IP connections
Figure 13-8 Basic two-key encryption
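To make the two-key idea concrete, here is a toy sketch in Python using RSA-style arithmetic with deliberately tiny primes. The function names and the numbers (p = 61, q = 53, e = 17) are assumptions chosen for illustration only; real systems use vetted libraries, much larger keys, and padding, none of which is shown here.

```python
def make_keypair(p=61, q=53, e=17):
    """Build a toy public/private key pair from two small primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (requires Python 3.8+)
    return (e, n), (d, n)  # (public key, private key)


def apply_key(key, value):
    """Encrypt with one key or decrypt with the other: value^exponent mod n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


public_key, private_key = make_keypair()
ciphertext = apply_key(public_key, 65)           # anyone may encrypt with the public key
plaintext = apply_key(private_key, ciphertext)   # only the private key recovers 65
```

The point of the sketch is the asymmetry: what one key scrambles, only the other key of the pair can unscramble.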
Authentication Schemes
• Goal – obtain a positive identification of the user
• Passwords: First line of defense
  – Should be at least 8 characters long
  – Should combine alphabetic and numeric data
  – Should not be complete words or personal information
  – Should be changed frequently
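The first three password guidelines above can be checked mechanically. The following Python sketch is one hypothetical way to do so; the function name, the word list, and the personal-information list are assumptions supplied by the caller, and the "changed frequently" rule is not covered because it needs password-age data.

```python
import re


def password_problems(password, personal_info=(), dictionary=()):
    """Return a list of guideline violations (an empty list means acceptable)."""
    problems = []
    if len(password) < 8:
        problems.append("shorter than 8 characters")
    has_letter = re.search(r"[A-Za-z]", password) is not None
    has_digit = re.search(r"[0-9]", password) is not None
    if not (has_letter and has_digit):
        problems.append("does not combine letters and digits")
    lowered = password.lower()
    if lowered in {word.lower() for word in dictionary}:
        problems.append("is a complete word")
    if any(item and item.lower() in lowered for item in personal_info):
        problems.append("contains personal information")
    return problems
```

For example, password_problems("marsha1975", personal_info=["Marsha"]) reports the personal-information violation even though the length and character-mix rules are satisfied.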
Authentication Schemes (cont.)
• Strong Authentication
  – Passwords are flawed:
    • Users share them with each other
    • They get written down, could be copied
    • Automatic logon scripts remove need to explicitly type them in
    • Unencrypted passwords travel the Internet
• Possible solutions:
  – Two factor: e.g., smart card plus PIN
  – Three factor: e.g., smart card, biometric, PIN
  – Biometric devices: use of fingerprints, retinal scans, etc. for positive ID
  – Third-party mediated authentication: using secret keys, digital certificates
Security Policies and Procedures
• Personnel controls
  – Hiring practices, employee monitoring, security training
• Physical access controls
  – Equipment locking, check-out procedures, screen placement
• Maintenance controls
  – Maintenance agreements, access to source code, quality and availability standards
• Data privacy controls
  – Adherence to privacy legislation, access rules
Database Recovery
Mechanism for restoring a database quickly and accurately after loss or damage
Recovery facilities:
• Backup Facilities
• Journalizing Facilities
• Checkpoint Facility
• Recovery Manager
Back-up Facilities
• Automatic dump facility that produces backup copy of the entire database
• Periodic backup (e.g., nightly, weekly)
• Cold backup: database is shut down during backup
• Hot backup: selected portion is shut down and backed up at a given time
• Backups stored in secure, off-site location
Journalizing Facilities
• Audit trail of transactions and database updates
• Transaction log: record of essential data for each transaction processed against the database
• Database change log: images of updated data
  – Before-image: copy before modification
  – After-image: copy after modification
Produces an audit trail
Figure 13-9 Database audit trail
From the backup and logs, databases can be restored in case of damage or loss
Checkpoint Facilities
• DBMS periodically refuses to accept new transactions
• System is in a quiet state
• Database and transaction logs are synchronized
This allows the recovery manager to resume processing from the most recent checkpoint, repeating only a short period of work instead of the entire day
Recovery and Restart Procedures
• Disk Mirroring: switch between identical copies of databases
• Restore/Rerun: reprocess transactions against the backup
• Transaction Integrity: commit or abort all transaction changes
• Backward Recovery (Rollback): apply before images
• Forward Recovery (Roll Forward): apply after images (preferable to restore/rerun)
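Backward and forward recovery both work from the before- and after-images kept in the database change log. The sketch below models the database as a plain Python dict and the log as a list; ChangeLogEntry, rollback, and rollforward are hypothetical names, and a real recovery manager would also handle checkpoints and partially written pages, which are left out.

```python
from dataclasses import dataclass


@dataclass
class ChangeLogEntry:
    txn_id: str
    key: str
    before: object  # before-image (None if the record did not exist)
    after: object   # after-image


def rollback(db, log, txn_id):
    """Backward recovery: undo one transaction by applying its
    before-images in reverse log order."""
    for entry in reversed(log):
        if entry.txn_id == txn_id:
            if entry.before is None:
                db.pop(entry.key, None)  # the record was created by the txn
            else:
                db[entry.key] = entry.before
    return db


def rollforward(backup, log, committed):
    """Forward recovery: start from a backup copy and apply the
    after-images of committed transactions in log order."""
    db = dict(backup)
    for entry in log:
        if entry.txn_id in committed:
            db[entry.key] = entry.after
    return db
```

Rollforward only copies stored after-images, so it avoids re-executing the original transactions, which is why it is usually preferred to restore/rerun.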
Transaction ACID Properties
• Atomic
  – Transaction cannot be subdivided
• Consistent
  – Constraints don’t change from before transaction to after transaction
• Isolated
  – Database changes not revealed to users until after transaction has completed
• Durable
  – Database changes are permanent
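Atomicity and consistency can be observed with a small funds-transfer sketch using Python's standard-library sqlite3 module. The account table and the transfer function are hypothetical; the credit is deliberately applied before the debit so that a failed debit shows the earlier change being undone.

```python
import sqlite3


def make_bank():
    """Create an in-memory bank with a constraint that balances stay non-negative."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move funds as one transaction: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint failed; the credit was rolled back


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))
```

An overdrawing transfer leaves both balances exactly as they were, because the credit to the destination account is rolled back along with the failed debit.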
Figure 13-10 Basic recovery techniques a) Rollback
Figure 13-10 Basic recovery techniques (cont.) b) Rollforward
Database Failure Responses
• Aborted transactions
  – Preferred recovery: rollback
  – Alternative: Rollforward to state just prior to abort
• Incorrect data
  – Preferred recovery: rollback
  – Alternative 1: rerun transactions not including inaccurate data updates
  – Alternative 2: compensating transactions
• System failure (database intact)
  – Preferred recovery: switch to duplicate database
  – Alternative 1: rollback
  – Alternative 2: restart from checkpoint
• Database destruction
  – Preferred recovery: switch to duplicate database
  – Alternative 1: rollforward
  – Alternative 2: reprocess transactions
Concurrency Control
• Problem: in a multi-user environment, simultaneous access to data can result in interference and data loss
• Solution: Concurrency Control
  – The process of managing simultaneous operations against a database so that data integrity is maintained and the operations do not interfere with each other in a multi-user environment
Figure 13-11 Lost update (no concurrency control in effect)
Simultaneous access causes updates to cancel each other
A similar problem is the inconsistent read problem
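The lost update can be reproduced deterministically, without threads, by choosing the order of reads and writes by hand. In this Python sketch the starting balance and withdrawal amounts are made-up values, and the two function names are hypothetical.

```python
def serial_withdrawals(balance, amounts):
    """Each withdrawal reads the latest balance before writing its result."""
    for amount in amounts:
        current = balance          # read
        balance = current - amount  # write
    return balance


def interleaved_withdrawals(balance, amounts):
    """Every user reads before anyone writes, so later writes are based
    on stale values and overwrite the earlier ones."""
    stale_reads = [balance for _ in amounts]  # all reads happen first
    for read_value, amount in zip(stale_reads, amounts):
        balance = read_value - amount  # overwrites the previous write
    return balance
```

With a balance of 1000 and withdrawals of 200 and 300, the serial order ends at 500, while the interleaved order ends at 700: the first withdrawal has been lost.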
Concurrency Control Techniques
• Serializability
  – Finish one transaction before starting another
• Locking Mechanisms
  – The most common way of achieving serialization
  – Data that is retrieved for the purpose of updating is locked for the updater
  – No other user can perform update until unlocked
Figure 13-12: Updates with locking (concurrency control)
This prevents the lost update problem
Locking Mechanisms
• Locking level:
  – Database: used during database updates
  – Table: used for bulk updates
  – Block or page: very commonly used
  – Record: only requested row; fairly commonly used
  – Field: requires significant overhead; impractical
• Types of locks:
  – Shared lock: Read but no update permitted. Used when just reading to prevent another user from placing an exclusive lock on the record
  – Exclusive lock: No access permitted. Used when preparing to update
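The compatibility rules for shared and exclusive locks can be sketched as a tiny record-level lock table. LockManager below is a hypothetical, single-threaded illustration in Python: a refused request simply returns False where a real DBMS would make the transaction wait.

```python
class LockManager:
    """Record-level lock table with shared ('S') and exclusive ('X') modes."""

    def __init__(self):
        self._locks = {}  # record id -> (mode, set of holding transactions)

    def acquire(self, txn, record, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        held = self._locks.get(record)
        if held is None:
            self._locks[record] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # any number of readers may share the record
            return True
        if holders == {txn}:
            # The sole holder may re-request or upgrade from shared to exclusive.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self._locks[record] = (new_mode, holders)
            return True
        return False  # conflicts with a lock held by another transaction

    def release(self, txn, record):
        """Drop txn's lock on the record, removing the entry when it is free."""
        held = self._locks.get(record)
        if held and txn in held[1]:
            held[1].discard(txn)
            if not held[1]:
                del self._locks[record]
```

Two readers can hold shared locks on the same record at once, but an exclusive request is refused until every other holder has released.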
Deadlock
• An impasse that results when two or more transactions have locked common resources, and each waits for the other to unlock their resources
Figure 13-13 The problem of deadlock
John and Marsha will wait forever for each other to release their locked resources!
Managing Deadlock
• Deadlock prevention:
  – Lock all records required at the beginning of a transaction
  – Two-phase locking protocol
    • Growing phase
    • Shrinking phase
  – May be difficult to determine all needed resources in advance
• Deadlock Resolution:
  – Allow deadlocks to occur
  – Mechanisms for detecting and breaking them
    • Resource usage matrix
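Detection can be sketched as a search for a cycle among "who holds what" and "who waits for what", which is the information a resource usage matrix records. The Python function below is a simplified assumption: each resource has one holder, each transaction waits for at most one resource, and the names find_deadlock, holds, and waits_for are hypothetical.

```python
def find_deadlock(holds, waits_for):
    """Return the transactions forming a wait cycle, or [] if there is none.

    holds:     resource -> transaction currently holding it
    waits_for: transaction -> resource it is waiting to lock
    """
    for start in waits_for:
        path = []
        txn = start
        # Follow the chain: txn waits for a resource held by the next txn.
        while txn in waits_for and txn not in path:
            path.append(txn)
            txn = holds.get(waits_for[txn])
        if txn in path:
            return path[path.index(txn):]  # the cycle itself
    return []
```

Once a cycle is found, the DBMS can break the deadlock by choosing one transaction in the cycle as a victim and rolling it back so the others can proceed.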
Versioning
• Optimistic approach to concurrency control
• Instead of locking
• Assumption is that simultaneous updates will be infrequent
• Each transaction can attempt an update as it wishes
• The system will reject an update when it senses a conflict
• Use of rollback and commit for this
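A common way to sense a conflict is to compare a version number at commit time with the version the transaction originally read. The Python sketch below assumes that scheme; VersionedStore and its methods are hypothetical names, and a real DBMS may use timestamps or record versions instead.

```python
class VersionedStore:
    """Optimistic concurrency: no locks, conflicts are detected at commit."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def put_initial(self, key, value):
        self._data[key] = (1, value)

    def read(self, key):
        """Return (version, value); the caller remembers the version."""
        return self._data[key]

    def commit(self, key, read_version, new_value):
        """Apply the update only if nobody committed since our read."""
        current_version, _ = self._data[key]
        if current_version != read_version:
            return False  # conflict: caller rolls back, re-reads, retries
        self._data[key] = (current_version + 1, new_value)
        return True
```

A transaction whose commit is rejected re-reads the record and tries again, so the earlier update is preserved rather than overwritten.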
Figure 13-15 The use of versioning
Better performance than locking when conflicting updates are infrequent
Data Dictionaries and Repositories
• Data dictionary
  – Documents data elements of a database
• System catalog
  – System-created database that describes all database objects
• Information Repository
  – Stores metadata describing data and data processing resources
• Information Repository Dictionary System (IRDS)
  – Software tool managing/controlling access to information repository
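A system catalog can be queried like any other data. As a small illustration, the Python sketch below reads SQLite's built-in catalog (the sqlite_master table and the table_info pragma) through the standard-library sqlite3 module; other DBMSs expose their catalogs under different names, and the catalog_summary function is a hypothetical helper.

```python
import sqlite3


def catalog_summary(conn):
    """Return {table name: [(column name, declared type), ...]} from the catalog."""
    summary = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        summary[table] = [(col[1], col[2]) for col in columns]
    return summary
```

Because the DBMS maintains the catalog itself, the summary reflects the current schema without any separate documentation step.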
Figure 13-16 Three components of the repository system architecture
A schema of the repository information
Software that manages the repository objects
Where repository objects are stored
Source: adapted from Bernstein, 1996.
Database Performance Tuning
• DBMS Installation
  – Setting installation parameters
• Memory Usage
  – Set cache levels
  – Choose background processes
• Input/Output (I/O) Contention
  – Use striping
  – Distribution of heavily accessed files
• CPU Usage
  – Monitor CPU load
• Application tuning
  – Modification of SQL code in applications
Data Availability
• Downtime is expensive
• How to ensure availability
  – Hardware failures: provide redundancy for fault tolerance
  – Loss of data: database mirroring
  – Maintenance downtime: automated and non-disruptive maintenance utilities
  – Network problems: careful traffic monitoring, firewalls, and routers
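The slide does not state how availability is measured. One widely used measure, assumed here, is the fraction of time the system is up, computed from mean time between failures (MTBF) and mean time to repair (MTTR). The Python sketch below uses hypothetical function names and made-up figures.

```python
HOURS_PER_YEAR = 365 * 24


def availability(mtbf_hours, mttr_hours):
    """Fraction of time the system is up: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def annual_downtime_hours(avail):
    """Expected hours of downtime per (non-leap) year at a given availability."""
    return (1 - avail) * HOURS_PER_YEAR
```

For instance, a system that fails on average every 999 hours and takes 1 hour to repair is 99.9% available, which still amounts to roughly 8.76 hours of downtime per year.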
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic,
mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America.