CSE3180 Semester 1 2005 Week 7 / 1
Lecture 7
Data Storage and Access Methods
Access Methods and Storage Organisation
This lecture will introduce some aspects associated with Data Access Methods (disk based) and the associated Storage Organisations.
These factors are part of the Implementation Planning stage and can be regarded as ‘tuning’ in that they provide a sound basis for performance and therefore response times. There are also additional Integrity control functions which can be introduced into the database at this stage.
Terminology
MEDIA : Magnetic Disks, Optical Disks, CD-ROMs, Other devices (Smart Cards)
TERMS : Seek Time, Rotational Delay, Cylinder, Track, Sector, Block, Page, Device
ACCESS TIMES : ~ 400 ms or less for a floppy disk; 15 to 20 ms or less for a large fast disk
(is a CD-ROM faster ?)
AIM OF Storage Structures and DBMS : To reduce the number of I/O’s
STORAGE STRUCTURE : An arrangement of data on a storage medium
DATA ACCESS SOFTWARE : 1. Disk Manager (page level) 2. File Manager (record level)
Storage Terms
• Track : Concentric division of the surface of a disk
• Sector : Fixed size component of a track
• Cylinder : Set of tracks in the corresponding position of all surfaces of a disk pack
• Page : Physical amount of data transferred between memory and disk. Can be 1024, 2048, 4096 bytes depending on the disk unit and the controller
• Bucket : Used to identify a physical storage address. Normally holds a number of logical records
• Packing Density : Ratio of stored records to the number of spaces
Storage Terms
Data is accessed a ‘page’ at a time
1 disk Input/Output process (I/O) is required per page
Keyed storage structures allow direct access to a page
Main pages are those originally allocated to a storage structure to hold rows
Overflow pages are added as the table grows and the main page is full. Duplicate rows can cause overflows
Choosing the appropriate structure is important for:
• performance
• concurrency
• disk space availability
Storage Terms
Tables are stored in files; files are divided into pages.
An Oracle Page = 2048 bytes
Approx. 2008 bytes for user data and 40 bytes for Ingres overheads
Pages are divided into records. Records cannot span pages
Record width = width of row + 2 bytes. The 2 bytes are used for the start of the row. This space is not reused when (if) the row is deleted
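The page figures above can be turned into a rough capacity estimate. The following is a minimal sketch, assuming the 2008 usable bytes and the 2-byte row overhead quoted on the slide; the 98-byte row width and the function names are hypothetical, and fill factor and overflow pages are ignored.

```python
import math

PAGE_USER_BYTES = 2008   # usable bytes per 2048-byte page (figure from the slide)
ROW_OVERHEAD = 2         # extra bytes stored with every row (figure from the slide)

def rows_per_page(row_width: int) -> int:
    """Whole records that fit on one page; records cannot span pages."""
    return PAGE_USER_BYTES // (row_width + ROW_OVERHEAD)

def pages_needed(n_rows: int, row_width: int) -> int:
    """Pages required for n_rows, ignoring fill factor and overflow pages."""
    return math.ceil(n_rows / rows_per_page(row_width))

# Hypothetical table: 1000 rows, each 98 bytes wide.
print(rows_per_page(98))        # 2008 // (98 + 2) = 20
print(pages_needed(1000, 98))   # 1000 / 20 = 50
```

Because a record cannot span pages, any bytes left over after the last whole record on a page are simply unused.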
Storage Terms
• Fill Factor : Percentage of page which should be occupied by rows of data
• Min Pages : Minimum number of pages (of storage) a table is to be allocated (watch for defaults)
• Max Pages : Maximum number of pages a table is allocated
• Index : A table or other data structure that is used to determine the location of rows in a table (or tables) that satisfy some condition
Internal (Physical) Storage
Directories:
Disk Table of Contents
Disk Directory
Page Set Directory
Index Directory
Storage Structures
[Diagram: three levels of buffering, with read and write operations moving records between them]
Application Buffers : Logical records (LR1, LR2, LR3, LR4)
DBMS Buffers : Logical/Physical (LR1, LR2, LR3, LR4)
Operating System : Physical Records (PR1, PR2)
Storage Structures
• At this level a database consists of physical records (aka blocks or pages)
• These are organised into files. A file is a collection of physical records organised for efficient access
• A physical record is a collection of bytes which are transferred between volatile storage in main memory and stable storage on disk
Storage Structures
• A physical record contains multiple logical records
• The size of a physical record is a power of two - such as 1024 (2^10), 4096 (2^12)
• A large logical record may be split over multiple physical records - and logical records from more than one table may be stored in the same physical record
Data Base Access
DBMS -> File Manager : Request for stored record
File Manager -> Disk Manager : Request for stored page
Disk Manager -> Data Base : Disk I/O
Data Base -> Disk Manager : Data read from disk
Disk Manager -> File Manager : Stored page returned
File Manager -> DBMS : Stored record returned
Functions : Retrieve, Replace, Add, Remove, Create, Destroy
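The layering on this slide can be sketched in a few lines. This is only an illustration of the division of labour, not how any particular DBMS is built: the class names, the list-of-lists "disk" and the 4-records-per-page figure are all assumptions.

```python
RECORDS_PER_PAGE = 4  # assumed; kept tiny so the example is easy to follow

class DiskManager:
    """Page level: knows only about pages, and counts each disk I/O."""
    def __init__(self, pages):
        self.pages = pages
        self.io_count = 0

    def read_page(self, page_no):
        self.io_count += 1           # one disk I/O per page requested
        return self.pages[page_no]

class FileManager:
    """Record level: turns a request for a record into a request for a page."""
    def __init__(self, disk):
        self.disk = disk

    def retrieve(self, record_no):
        page_no, slot = divmod(record_no, RECORDS_PER_PAGE)
        page = self.disk.read_page(page_no)   # request for stored page
        return page[slot]                     # stored record returned

disk = DiskManager([["r0", "r1", "r2", "r3"], ["r4", "r5", "r6", "r7"]])
files = FileManager(disk)
print(files.retrieve(5), disk.io_count)       # r5 1
```

The caller of `retrieve` never sees a page number, which is the point of separating record management from page management.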
A Possible Problem ?
Main memory capacity is always less than Disk Storage capacity
But in both media, each ‘piece of data’ is addressable
How can a ‘piece of data’ in the hard disk address range be brought into memory, have its address ‘altered’, and still be locatable ?
And how can data located in memory be directed to the high order areas of a mass storage disk device ?
There is a technique known as swizzling which addresses this interchange
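A toy illustration of the idea: references stored as disk addresses are replaced by in-memory references when records are loaded, and converted back before they are written out. This is a minimal sketch only; the `Record` class, the `(page, slot)` address format and the function names are assumptions, not part of the lecture.

```python
class Record:
    def __init__(self, disk_addr, payload, next_ref=None):
        self.disk_addr = disk_addr   # (page, slot) address on disk (assumed format)
        self.payload = payload
        self.next_ref = next_ref     # a disk address on disk; a Record once swizzled

def swizzle(records):
    """Replace disk-address references with direct in-memory references."""
    in_memory = {r.disk_addr: r for r in records}
    for r in records:
        if r.next_ref in in_memory:          # target is loaded, so point at it directly
            r.next_ref = in_memory[r.next_ref]

def unswizzle(records):
    """Turn in-memory references back into disk addresses before writing."""
    for r in records:
        if isinstance(r.next_ref, Record):
            r.next_ref = r.next_ref.disk_addr

a = Record((7, 0), "A", next_ref=(9, 2))
b = Record((9, 2), "B")
swizzle([a, b])
print(a.next_ref is b)        # True
unswizzle([a, b])
print(a.next_ref)             # (9, 2)
```

A reference whose target is not loaded is left as a disk address, so it can still be resolved later by reading the page it names.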
Storage Structures
Objective : To minimise the number of Disk Accesses (I/O's)
Access times are in the range 15ms to 300ms
STORAGE STRUCTURES : Arrangement of data on the storage medium
NO storage structure will optimise ALL Application requirements
DBMS systems should provide for a number of structures
Requires a sound understanding of the uses of the database (as determined in the LOGICAL design.)
Some Terms :
PAGE MANAGEMENT - DISK MANAGER
RECORD MANAGEMENT - FILE MANAGER
Access Calculation
Number of Pages : 300,000
No. of Disk I/O’s per second : 30
Assume 1 disk I/O per page
Assume HEAP Storage structure
select * from employee
where employee_name = 'Johnson';
(and remember the closure feature of SQL)
Time taken = 300,000 / 30 = 10,000 seconds
or approximately 3 hours
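The same arithmetic as a short script, using only the figures on this slide. The function name is hypothetical; the point is that a heap offers no way to skip pages, so every page is read.

```python
def full_scan_seconds(pages: int, ios_per_second: int, ios_per_page: int = 1) -> float:
    """Time to read every page of a heap table once."""
    return pages * ios_per_page / ios_per_second

seconds = full_scan_seconds(300_000, 30)
print(seconds)                    # 10000.0
print(round(seconds / 3600, 2))   # 2.78 hours, i.e. roughly 3 hours
```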
Access Methods
LINEAR and NON-LINEAR
1. Indexing
- Sequential : Index Keys
- Direct : Primary Index (index on Primary Key)
- Inverted Lists : Secondary Index on non-primary key
2. Hashing
- Direct Access : Attribute(s) value -> Computed Disk Address
A database can have any number of Indexes, BUT only ONE HASH STRUCTURE
File Organisations
A file organisation is a technique for physically arranging the records of a file on a secondary storage device.
File organisations
- Sequential
- Indexed
  - Sequential (block index)
    - Hardware-dependent (ISAM)
    - Hardware-independent (VSAM)
  - Nonsequential (full index)
- Direct
  - Relative-Addressed
  - Hash-Addressed
Sequential Access
• Arrangement in physical sequence - dependent on some attribute (sequencing attribute)
• Can be ordered
  • Serially (as the rows occur)
    New additions (inserts) are placed after the last row (known as a HEAP structure)
  • Increasing / Decreasing order by value of the sequencing attribute
    Inserts, Deletes and Updates handled by rewriting the entire table
• Requires progression beyond a starting key value
Indexed Sequential Access
There are two basic implementations of the indexed sequential organisation:
- hardware-dependent : uses a block index on the key, giving the disk address of the prime area (which contains the data records), and the track index for the cylinder
- hardware-independent : uses a control interval, which may be considered a virtual track; free space for new records is provided by distributed free space
Indexed Sequential File Organisation
Creates Tables - the Index tables
Methods
1. The exact disk address of every record (records are held sequentially)
   Known as a DENSE index
2. The beginning disk address of a group of records (also held sequentially)
   Known as a SPARSE index
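The difference between the two methods shows up in the size of the index. A minimal sketch, assuming a sorted file of three pages with made-up integer keys; the variable names are illustrative only.

```python
# Sorted file: each inner list is one page of keys (hypothetical data).
pages = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

# Dense index: one entry per record, giving its exact (page, slot) address.
dense = {key: (p, s) for p, page in enumerate(pages) for s, key in enumerate(page)}

# Sparse index: one entry per group (here, per page), keyed by the first key of the group.
sparse = [(page[0], p) for p, page in enumerate(pages)]

print(len(dense), len(sparse))   # 9 3
print(dense[50])                 # (1, 1)
print(sparse)                    # [(10, 0), (40, 1), (70, 2)]
```

The sparse index is smaller but only narrows the search to a page; the page itself must still be searched for the record.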
Indexed Sequential File Organisation
Some terms to note:
Track, Cylinder and Volume
These lead to
1. Track Index
2. Cylinder Index
ISAM provides
- Sequential Access
- Direct Access
Direct Access File Organisation
Objective : To provide rapid, direct, non-sequential access to records
Index tables are not created
The organisation does not readily permit sequenced output
Based on deriving a TARGET address (disk address)
Hashing algorithm on record key(s)
Direct Access File Organisation
2. Hashed or Calculated address
Based on transforming a key of each record to create a ‘unique’ value, which then becomes a disk address
Used for Open or Sparse populations
There are some possible problems:
(a) collisions - more than 1 key transform generating the same ‘address’
(b) collisions - caused by record lengths being larger than a sector
Possible Solutions: Overflow areas with Pointer mechanism with serial loading and searching
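One way to picture the overflow solution is a main bucket of fixed capacity plus an overflow chain that is loaded and searched serially. This is a minimal sketch under stated assumptions: 7 buckets, 2 records per bucket, integer keys, and a simplified cost model in which the main bucket costs one read and each overflow record examined costs one more.

```python
BUCKETS = 7      # number of main buckets (assumed)
CAPACITY = 2     # records per main bucket (assumed)

main = [[] for _ in range(BUCKETS)]
overflow = [[] for _ in range(BUCKETS)]   # one serially loaded chain per bucket

def address(key: int) -> int:
    """Key transform: the computed bucket address."""
    return key % BUCKETS

def insert(key: int) -> None:
    b = address(key)
    target = main[b] if len(main[b]) < CAPACITY else overflow[b]
    target.append(key)                    # overflow is filled in arrival order

def find(key: int):
    """Return (found, reads) under the simplified cost model described above."""
    b = address(key)
    reads = 1                             # read the main bucket
    if key in main[b]:
        return True, reads
    for k in overflow[b]:                 # serial search along the overflow chain
        reads += 1
        if k == key:
            return True, reads
    return False, reads

for k in (7, 14, 21, 28, 3):              # 7, 14, 21 and 28 all collide on bucket 0
    insert(k)
print(find(14))   # (True, 1)
print(find(28))   # (True, 3)
```

Each collision that spills into overflow adds reads, which is why heavily overflowed hash files are candidates for reorganisation.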
Reorganisation
Access performance and Disk occupancy statistics will indicate when a database reorganisation should be performed
Very time consuming (and requires additional disk space)
Data is redistributed - any existing indexes will need to be remade
Any existing pointers, or pointer chains, will need to be reset
Access Methods
Sequential Access
In sequential access, record storage starts at a designated point, usually the beginning, and proceeds in a linear sequence through the file. Each record can only be retrieved by accessing all the records that physically precede it.
Random Access
In random access, a given record is accessed "out of the blue" without referencing other records in the file.
Access Comparisons
A File organisation is established when the file is created, and is rarely changed. However, record access mode can change each time the file is used.
File Organisation    Record access mode
                     Sequential          Random
Sequential           Yes                 No (impractical)
Indexed Seq.         Yes                 Yes
Direct-Relative      Yes                 Yes
Direct-Hashed        No (impractical)    Yes
Hashing Routines
Records are assigned to buckets by means of a hashing routine, or transformation, which is an algorithm that converts each primary key value into a relative disk address.
An example of one that consistently performs best under most conditions is:
* division/remainder method
1. Determine the number of buckets to be allocated to the file.
2. Select a prime number that is approximately equal to this number.
3. Divide each primary key value (usually the ASCII sum) by the prime number.
4. Use the remainder as the relative bucket address.
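The four steps can be sketched as follows. The helper names are hypothetical, "approximately equal" is interpreted here as the largest prime not exceeding the bucket count (an assumption), and the key strings are made up for illustration.

```python
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def prime_at_most(n: int) -> int:
    """Step 2: largest prime <= n (n must be at least 2)."""
    while not is_prime(n):
        n -= 1
    return n

def bucket_for(key: str, prime: int) -> int:
    """Steps 3 and 4: divide the ASCII sum of the key by the prime, keep the remainder."""
    ascii_sum = sum(ord(ch) for ch in key)
    return ascii_sum % prime

prime = prime_at_most(100)        # Step 1: about 100 buckets wanted
print(prime)                      # 97
print(bucket_for("E45", prime))   # (69 + 52 + 53) % 97 = 77
print(bucket_for("E54", prime))   # 77 as well: same ASCII sum, so a collision
```

The last two lines show that the ASCII sum ignores character order, so keys that are permutations of each other always collide.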
Binary Trees
A non-linear data structure, each element having several "next" elements ( branching ).
A binary tree has a maximum of two branches per element or node.
A node consists of some data and a maximum of two pointers: a left pointer to the left branch and a right pointer to the right branch. If there is no left or right branch then a nil pointer is used.
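A minimal sketch of such a node and of insertion, assuming the usual ordering rule (smaller keys go left, larger keys go right) and using `None` as the nil pointer. The keys are the PRODUCT# values from the lecture's insertion example; the class and function names are illustrative.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None     # nil pointer until a left branch exists
        self.right = None    # nil pointer until a right branch exists

def insert(root, key):
    """Insert key below root and return the (possibly new) root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def in_order(node):
    """Left subtree, node, right subtree: yields the keys in sorted order."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)

root = None
for key in (1000, 1600, 350, 2000, 975, 625):
    root = insert(root, key)

print(in_order(root))            # [350, 625, 975, 1000, 1600, 2000]
print(root.left.right.left.key)  # 625
```

Note that the shape of the tree depends entirely on the order in which the keys arrive.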
A Binary Tree Diagram
Node layout : PRODUCT# (Primary Key, Data) | LLINK (Less Than Pointer) | RLINK (Greater Than Pointer)
[Figure: the tree after each step]
(1) Initial tree (empty)
(2) Insert 1000 : 1000 becomes the root
(3) Insert 1600 : > 1000, placed to the right of 1000
(4) Insert 0350 : < 1000, placed to the left of 1000
(5) Insert 2000 : > 1000, > 1600, placed to the right of 1600
(6) Insert 0975 : < 1000, > 0350, placed to the right of 0350
(7) Insert 0625 : < 1000, > 0350, < 0975, placed to the left of 0975
An Example of a Binary Tree
[Figure: binary tree with 1000 at the root, containing the keys 0100, 0350, 0625, 0975, 1000, 1250, 1425, 1600, 1775, 2000]
Task: Indicate the different traversals on this diagram.
Binary and B Trees
The problem with Binary Trees is balance: the tree can easily deteriorate into a linked list. Consequently, the reduced search times are lost.
This problem is overcome in B-trees.
B for Balanced, where all the leaves are the same distance from the root.
B-trees guarantee a predictable efficiency.
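The balance problem is easy to demonstrate: the same keys give a shallow tree or a chain depending on arrival order. A minimal sketch, assuming a plain unbalanced binary search tree; the list-based node representation and the function name are illustrative only.

```python
def height_after_inserts(keys):
    """Insert keys into an unbalanced binary search tree; return its height in nodes."""
    root = None                       # each node is [key, left, right]
    height = 0
    for key in keys:
        node, depth = [key, None, None], 1
        if root is None:
            root = node
        else:
            cur = root
            while True:
                side = 1 if key < cur[0] else 2   # 1 = left link, 2 = right link
                depth += 1
                if cur[side] is None:
                    cur[side] = node
                    break
                cur = cur[side]
        height = max(height, depth)
    return height

keys = [1000, 350, 1600, 100, 975, 1250, 2000]
print(height_after_inserts(keys))           # 3: a balanced arrival order
print(height_after_inserts(sorted(keys)))   # 7: sorted arrival gives a linked list
```

With sorted input every key goes to the right of the previous one, so a search visits as many nodes as a serial scan would.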
B+ Trees
There are several varieties of B-trees; most applications use the B+-tree.
A B+-tree of degree m has the following properties:
1. Every node has between ⌈m/2⌉ and m children (where m is an integer > 3 and usually odd), except the root, which is not bound by a lower limit.
2. All leaves are at the same level, that is the same depth from the root.
3. A nonleaf node that has n children will contain n-1 keys.
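Properties 1 and 3 translate directly into bounds on node occupancy. A minimal sketch, reading the lower limit as the ceiling of m/2 (an assumption about the slide's bracket notation); the function names are hypothetical and the root is not covered because it is exempt from the lower limit.

```python
import math

def child_bounds(m: int):
    """Minimum and maximum children of a non-root node in a B+-tree of degree m."""
    return math.ceil(m / 2), m

def key_bounds(m: int):
    """A nonleaf node with n children holds n - 1 keys."""
    low, high = child_bounds(m)
    return low - 1, high - 1

print(child_bounds(5))   # (3, 5)
print(key_bounds(5))     # (2, 4)
```

The lower bound keeps every non-root node at least about half full, which bounds the depth of the tree for a given number of keys.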
B+ Tree Node Structure
A high level node
P1 K1 P2 K2 . . . Pn-1 Kn-1 Pn
P1 : Pointer to subtree for keys < K1
P2 : Pointer to subtree for keys >= K1 & < K2
Pn-1 : Pointer to subtree for keys >= Kn-2 & < Kn-1
Pn : Pointer to subtree for keys >= Kn-1
A leaf node
P1 K1 P2 K2 . . . Pn-1 Kn-1 Pn
P1 : Pointer to record (block) with key K1
P2 : Pointer to record (block) with key K2
Pn-1 : Pointer to record (block) with key Kn-1
Pn : Pointer to leaf with smallest key greater than Kn-1
B+ Tree
[Figure: example B+ tree. Root key: 1250. Second-level nodes hold the keys 0625, 1000 and 1425, 2000. The leaves hold the keys 0350, 0625, 1000, 1250, 1300, 1425, 1600, 2000 and point to the actual data records.]
A review of Trees
Can permit rapid retrieval of data for both random and sequential processing.
Can be used on primary or secondary keys.
Trees are special cases of networks; in networks, records from different files are joined without a strict hierarchy being observed. This is addressed in the hierarchical and network model lectures.
Some Index Terminology
The attributes which contribute to the index are called the Indexed fields
• If the index is built on the Primary Key, then the Index is called a Primary Index
• If the index is built on any other attributes, it is called a secondary index, and the attributes need not be unique
Clustering Index: The index is built on NON-UNIQUE attributes and includes one index entry for each distinct value of the attribute. The index entry points to the first data block which contains corresponding values. The file must be ordered on the chosen non-unique attributes
Multi Level Indexes
A single level index is an ordered file. It is possible therefore to create a non-dense index to an index.
This is known as a Second Level Index
The process can be repeated until the ‘highest’ level of the index can fit into main memory
This will probably be 1 page and has the effect of reducing the number of I/O’s by 1
This concept is known as a non-linear or tree structure
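How many levels that takes can be estimated with a short loop. This is a sketch under stated assumptions: each level is a non-dense index with one entry per page of the level below, and an index page holds 100 entries (a made-up figure). The function name is hypothetical.

```python
import math

def index_levels(n_data_pages: int, entries_per_index_page: int) -> int:
    """Index levels needed until the top level fits on a single page."""
    levels = 0
    entries = n_data_pages                 # first level: one entry per data page
    while True:
        levels += 1
        pages = math.ceil(entries / entries_per_index_page)
        if pages <= 1:
            return levels
        entries = pages                    # next level indexes this level's pages

levels = index_levels(300_000, 100)
print(levels)        # 3  (3000 pages, then 30 pages, then 1 page)
print(levels + 1)    # 4 page reads per lookup: 3 index pages + 1 data page
print(levels)        # 3 reads if the top-level page is kept in main memory
```

Compare this with the 300,000 page reads of a full scan over the same number of data pages.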
Index Usage
An index is used to optimise retrievals
Index (key : page no)
E1 : 1
E20 : 2
E40 : 3
Data pages
p1 : E1, E2, E3
p2 : E20, E30
p3 : E40, E45, E56
select * from emp where eno = 'E4';
Total number of accesses = Index Accesses + 1 Data Access = 1 + 1 = 2
Average no. of Serial Accesses for a Table with 3 pages is (n + 1)/2 = 2
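The access counts can be checked against the slide's three pages. This is a minimal sketch: it assumes employee numbers are ordered by their numeric part, counts one access for the index page and one per data page read, and uses hypothetical function names.

```python
from bisect import bisect_right

def key_num(eno: str) -> int:
    return int(eno[1:])            # assumption: order by the numeric part of the key

index = [(1, 1), (20, 2), (40, 3)]                 # (first key on page, page no)
pages = {1: ["E1", "E2", "E3"], 2: ["E20", "E30"], 3: ["E40", "E45", "E56"]}

def indexed_lookup(eno: str):
    """Return (found, accesses): 1 index access plus 1 data page access."""
    firsts = [first for first, _ in index]
    pos = bisect_right(firsts, key_num(eno)) - 1   # last entry with first key <= eno
    if pos < 0:
        return False, 1                            # below the lowest key: index only
    page_no = index[pos][1]
    return eno in pages[page_no], 2

def serial_lookup(eno: str):
    """Return (found, accesses) when pages are read one after another."""
    for n, page_no in enumerate(sorted(pages), start=1):
        if eno in pages[page_no]:
            return True, n
    return False, len(pages)

print(indexed_lookup("E30"))   # (True, 2)
print(serial_lookup("E45"))    # (True, 3)
average = sum(serial_lookup(p[0])[1] for p in pages.values()) / len(pages)
print(average)                 # 2.0
```

With only 3 pages the index gives no saving on average; the benefit appears as the table grows while the index lookup stays at a fixed cost.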
Good/Bad Candidates for Indexes
(Ingres specific)
Good Candidates:
Create indexes on attributes used in predicates:-
Read only and frequently accessed tables > 3 tables
Attributes of a predicate in frequently executed transactions
High update tables > 6 pages
Attributes used in joins
Attributes where aggregates are frequently calculated
On FK’s if using RI. Integrity violations or cascade speed
Good/Bad Candidates for Indexes
Poor Candidates:
Attributes with a small number of unique values
High update attributes - keep to 2 or 3 if possible
Storage Structures 1
Requirements                   Heap   Hash   ISAM   BTree
Need Pattern Matching          4      4      1      1
Need Range Searches            4      4      1      1
Exact Match Key Retrieval      4      1      2      2
Sorted Data                    4      4      2      1
Concurrent Updates             4      1      1      2
Add Data - No Modify           2      3      3      1
Sequential Addition of Data    1**    2      5      1
Storage Structures 2
Requirements                   Heap   Hash   ISAM   BTree   Notes
Initial Bulk Copy of Data      1      2      2      2
Table Growth - nil/static      N/A    1      1      2
Table growth - low (15%)       N/A    1      1      2       Plan to modify periodically
Table growth - high            3      3      3      1       Too fast to modify
Storage Structures 3
Requirements                                        Heap   Hash   ISAM   BTree
Table size : small                                  2      1      1      3
Table size : medium (modify disk space available)   4      1      1      1
Table size : large (>1/2 disk)                      2**    4      4      1
Deletes Frequent                                    4      1      1      3
Updates Frequent                                    4      1      1      2
Secondary Index Structure                           N/A    1      1      1
** secondary indexes used with a heap structure
Storage Structures
HEAP * Supported by Ingres and Oracle
HASH * Supported by Ingres and Oracle
ISAM * Supported by Ingres
BTREE * Supported by Ingres and Oracle
SORTED HEAP
* indicates that Compression is available
Other techniques:
Bitmaps
Partition Indexing / Reverse Key Indexing
Data Clustering - Indexed Data Clusters - Hash Clusters
What does Microsoft Access support ?
Other Methods
Bit Mapping
• This is another table and its contents are
  – a bit to indicate the presence of some value
  – a row i.d. to reference the row (rowid)
[Table: columns Row I.D., Green, Red; rows Row 1 to Row 4; a 1 in a colour column marks the rows holding that value]
• % of distinct values to total values should be low
• useful for DSS/data warehouse applications
• Not good for frequent update or insert applications
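A minimal sketch of the idea, using one integer per distinct value as the bit string. The four colour values are hypothetical (the original table's bit positions are not recoverable), and the function names are illustrative.

```python
colours = ["Green", "Red", "Green", "Red"]     # hypothetical values for rows 1 to 4

def build_bitmaps(values):
    """One bitmap per distinct value; bit i is set when row i + 1 holds that value."""
    bitmaps = {}
    for row, value in enumerate(values):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps

def rows_matching(bitmap: int, n_rows: int):
    """Row numbers (1-based) whose bit is set."""
    return [row + 1 for row in range(n_rows) if (bitmap >> row) & 1]

bitmaps = build_bitmaps(colours)
print(rows_matching(bitmaps["Red"], 4))                      # [2, 4]
print(rows_matching(bitmaps["Green"] | bitmaps["Red"], 4))   # [1, 2, 3, 4]
```

Combining predicates is a bitwise operation on the bitmaps, which suits query-heavy DSS work; every insert or update, however, has to touch the bitmaps of the values involved.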
Other Methods
Clustering - Oracle
Is a technique which ‘clusters’, or groups together, related rows of one or more tables in the same data block.
The objective is to store (on disk) rows of an application which are used together (e.g. Orders and Items) - this saves disk I/O on analysis applications
A cluster key is necessary for each cluster.
Not very successful for high volume processing
Ranking the Storage Structures
best used for                  heap   hash   isam   btree
bulkloading table with data    1      -      -      2
removing duplicate rows        -      1      1      1
exact match                    -      1      2      2
range/pattern matching         -      -      1      1
sequential searches            1      3      2      2
partial key                    -      -      1      1
access to sorted data          -      -      -      1
joins on large tables          -      -      -      1
index grows as table grows     -      -      -      1
very small tables              1      -      -      -
very large tables              -      -      -      1
• number indicates ranking of the various structures for the given task
• dash indicates that the structure is not appropriate for the particular task