
Chapter Ten

Storage Categories

A storage medium is required to store information/data.

Primary memory: can be accessed by the CPU directly; fast, expensive, and limited in capacity; volatile.

Secondary memory: data on it cannot be processed by the CPU directly; slow, larger in capacity, less expensive; non-volatile.

Secondary storage is the medium used for database storage.

Disks and Files

A DBMS stores information on (“hard”) disks.

This has major implications for DBMS design!

READ: transfer data from disk to main memory (RAM).

WRITE: transfer data from RAM to disk.

Both are high-cost operations, relative to in-memory operations, so they must be planned carefully!

Why Not Store Everything in Main Memory?

It costs too much.

Main memory is volatile. We want data to be saved between runs. (Obviously!)

Typical storage hierarchy:

Main memory (RAM) for currently used data.

Disk for the main database (secondary storage).

Tapes for archiving older versions of the data (tertiary storage).

Disks

Secondary storage device of choice.

Main advantage over tapes: random access vs. sequential.

Data is stored and retrieved in units called disk blocks or pages.

Unlike RAM, time to retrieve a disk page varies depending upon location on disk. Therefore, relative placement of pages on disk has major impact on DBMS performance!

Components of a Disk

[Figure: disk components, labeled spindle, platters, tracks, sector, disk head, arm assembly, and arm movement]

The platters spin (say, 90 rps).

The arm assembly is moved in or out to position a head on a desired track. Tracks under heads make a cylinder (imaginary!).

Only one head reads/writes at any one time.

Block size is a multiple of sector size (which is fixed).

Disk Storage Devices

The division of a track into sectors is hard-coded on the disk surface and cannot be changed. In one type of sector organization, a sector is the portion of a track that subtends a fixed angle at the center.

A track is divided into blocks. The block size B is fixed for each system. Typical block sizes range from B=512 bytes to B=4096 bytes. Whole blocks are transferred between disk and main memory for processing.

Accessing a Disk Page

Time to access (read/write) a disk block:

seek time (moving arms to position disk head on track)

rotational delay (waiting for block to rotate under head)

transfer time (actually moving data to/from disk surface)

Seek time and rotational delay dominate.

Seek time varies from about 1 to 20 msec.

Rotational delay varies from 0 to 10 msec.

Transfer time is about 1 msec per 4 KB page.

Key to lower I/O cost: reduce seek/rotation delays! Hardware vs. software solutions?
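The three cost components can be added up in a few lines. The sketch below is only illustrative: the function name and the specific millisecond values are assumptions, chosen from inside the ranges quoted on this slide.

```python
def access_time_ms(seek_ms, rotational_ms, transfer_ms):
    """Time to access one disk block: seek + rotational delay + transfer."""
    return seek_ms + rotational_ms + transfer_ms

# Assumed mid-range values for a block at a random position on the disk.
random_read = access_time_ms(seek_ms=10.0, rotational_ms=5.0, transfer_ms=1.0)

# Assumed best case: the next block sits right behind the previous one on the
# same track, so there is no seek and no rotational wait.
sequential_read = access_time_ms(seek_ms=0.0, rotational_ms=0.0, transfer_ms=1.0)

print(random_read, sequential_read)
```

Under these assumed numbers the transfer itself is a small share of a random access, which is why placing related pages next to each other pays off.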

Buffer Management in a DBMS

[Figure: page requests from higher levels arrive at the buffer pool in main memory; disk pages are read from the DB on disk into free frames; the choice of frame is dictated by the replacement policy]

Data must be in RAM for DBMS to operate on it!

A table of <frame#, pageid> pairs is maintained.
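A toy buffer pool can make the role of the replacement policy concrete. This is a minimal sketch under stated assumptions: the slides do not name a policy, so least-recently-used (LRU) is assumed here; the class `BufferPool`, the dict standing in for the disk, and the `disk_reads` counter are all hypothetical. Pin counts and dirty pages are left out.

```python
from collections import OrderedDict

class BufferPool:
    """Toy buffer pool. LRU replacement is an assumption, not from the slides."""

    def __init__(self, num_frames, disk):
        self.num_frames = num_frames
        self.disk = disk              # hypothetical stand-in: page_id -> page data
        self.frames = OrderedDict()   # page_id -> page data, least recent first
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.frames:
            # Page already in RAM: just mark it as most recently used.
            self.frames.move_to_end(page_id)
            return self.frames[page_id]
        if len(self.frames) >= self.num_frames:
            # No free frame: evict the least recently used page.
            self.frames.popitem(last=False)
        self.disk_reads += 1          # READ: disk -> main memory
        self.frames[page_id] = self.disk[page_id]
        return self.frames[page_id]

disk = {i: f"page{i}" for i in range(5)}
pool = BufferPool(num_frames=2, disk=disk)
for pid in (0, 1, 0, 2):
    pool.get_page(pid)
print(list(pool.frames), pool.disk_reads)
```

The ordered dictionary plays the part of the <frame#, pageid> table; a real buffer manager would track frame numbers explicitly and write modified pages back before evicting them.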

Records and Files

A record consists of a collection of related data values or items (also called fields, columns, etc.).

A file is a sequence of records, made up of either fixed-length records or variable-length records.

A database is stored as a collection of files.

Record Formats: Fixed Length

Information about field types is the same for all records in a file; it is stored in system catalogs.

Finding the i’th field does not require a scan of the record.

[Figure: a record with fields F1, F2, F3, F4 of lengths L1, L2, L3, L4, starting at base address B; the address of F3 is B + L1 + L2]
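The address arithmetic from the figure can be written as a one-line function. This is a sketch only; the function name, the base address, and the field lengths are assumed values for illustration.

```python
def field_address(base, lengths, i):
    """Address of the i-th field (1-based) of a fixed-length record.

    base    -- address B where the record starts
    lengths -- field lengths [L1, L2, ...], as found in the system catalog
    """
    return base + sum(lengths[: i - 1])

lengths = [4, 20, 8, 2]                      # assumed L1..L4 in bytes
addr_f3 = field_address(1000, lengths, 3)    # B + L1 + L2
print(addr_f3)
```

No part of the record is read to compute the address, which is the point of the fixed-length format.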

Fixed-Length Records

Simple approach:

Store record i starting from byte n × (i – 1), where n is the size of each record.

Record access is simple, but records may cross blocks.

Modification: do not allow records to cross block boundaries.

Deletion of record i, alternatives:

move records i + 1, ..., N to i, ..., N – 1 (N being the number of the last record)

move record N to i

do not move records, but link all free records on a free list


Fixed-Length Records – Deletion

Store the address of the first deleted record in the file header.

Use this first record to store the address of the second deleted record, and so on.

These stored addresses can be thought of as pointers, since they “point” to the location of a record.

More space-efficient representation: reuse the space for normal attributes of free records to store pointers. (No pointers are stored in in-use records.)
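The free-list scheme can be sketched with an in-memory list standing in for the file. Everything named here (`FixedLengthFile`, `slots`, `free_head`) is hypothetical, and the sketch assumes a newly deleted slot is pushed onto the front of the chain, which is one possible ordering.

```python
class FixedLengthFile:
    """In-memory stand-in for a file of fixed-length record slots."""

    def __init__(self):
        self.slots = []        # each slot: a record, or a pointer to the next free slot
        self.free_head = None  # file header field: first free slot, None if none

    def delete(self, i):
        # Reuse the deleted record's own space to hold the next-free pointer.
        self.slots[i] = self.free_head
        self.free_head = i

    def insert(self, record):
        if self.free_head is None:
            self.slots.append(record)      # no free slot: grow the file
            return len(self.slots) - 1
        i = self.free_head
        self.free_head = self.slots[i]     # follow the chain to the next free slot
        self.slots[i] = record
        return i

f = FixedLengthFile()
for rec in ("a", "b", "c", "d"):
    f.insert(rec)
f.delete(1)
f.delete(3)
reused_first = f.insert("x")
reused_second = f.insert("y")
print(reused_first, reused_second, f.slots)
```

No record is moved on deletion, and in-use slots carry no pointer overhead.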

Variable-Length Records

(b & c on the diagram)

Record Organization (on Disks)

(a) Unspanned. (b) Spanned.

Files of Records

Page or block is OK when doing I/O, but higher levels of DBMS operate on records, and files of records.

FILE: A collection of pages, each containing a collection of records. Must support:

insert/delete/modify record

read a particular record (specified using record id)

scan all records (possibly with some conditions on the records to be retrieved)

File Organization & Access Method

File organization refers to the physical arrangement of the data in a file into records and pages on secondary storage.

Access method refers to the steps involved in storing and retrieving records from a file.

Some common file organizations and access methods are discussed now.

File Organization and Access Method – Unordered File

Also called a heap or a pile file.

Simplest file structure: contains records in no particular order.

As the file grows and shrinks, disk pages are allocated and de-allocated.

New records are inserted at the end of the file.

To search for a record, a linear search through the file records is necessary. This requires reading and searching half the file blocks on average, and is hence quite expensive.

Record insertion is quite efficient.

Reading the records in order of a particular field requires sorting the file records.

File Organization and Access Method – Ordered File

Also called a sequential file.

File records are kept sorted by the values of an ordering field.

Insertion is expensive: records must be inserted in the correct order. It is common to keep a separate unordered overflow (or transaction) file for new records to improve insertion efficiency; this is periodically merged with the main ordered file.

A binary search can be used to search for a record on its ordering field value. For a file of b blocks, this requires reading and searching about log2(b) blocks on average, an improvement over linear search.

Reading the records in order of the ordering field is quite efficient.
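Binary search on an ordered file works block by block rather than record by record. The sketch below counts block reads; the function name and the representation of the file as a list of sorted key lists are assumptions made for illustration.

```python
def binary_search_blocks(blocks, key):
    """Search an ordered file for key; return (found, number_of_block_reads).

    blocks -- list of non-empty blocks; keys are sorted within and across blocks.
    """
    lo, hi, reads = 0, len(blocks) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]          # one block read from disk
        reads += 1
        if key < block[0]:
            hi = mid - 1             # key can only be in an earlier block
        elif key > block[-1]:
            lo = mid + 1             # key can only be in a later block
        else:
            return key in block, reads
    return False, reads

# Assumed example file: 8 blocks holding 4 keys each (keys 0..31).
blocks = [[i * 4 + j for j in range(4)] for i in range(8)]
print(binary_search_blocks(blocks, 0))
```

With 8 blocks, no lookup needs more than 4 block reads, while a linear scan of a heap file of the same size reads 4 blocks on average and 8 in the worst case; the gap widens quickly as the file grows.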

Ordered File

File Organization and Access Method – Hash Files

Hashing for disk files is called External Hashing.

The file blocks are divided into M equal-sized buckets, numbered bucket 0, bucket 1, ..., bucket M-1. Typically, a bucket corresponds to one (or a fixed number of) disk blocks.

One of the file fields is designated to be the hash key of the file.

The record with hash key value K is stored in bucket i, where i=h(K), and h is the hashing function.

Search is very efficient on the hash key.

Collisions occur when a new record hashes to a bucket that is already full. An overflow file is kept for storing such records. Overflow records that hash to each bucket can be linked together.
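A small in-memory model shows how buckets and overflow chains interact. This is a sketch under assumptions: integer keys, `h(K) = K mod M` as the hash function, and Python lists standing in for disk blocks and for the per-bucket overflow chains; the class and attribute names are hypothetical.

```python
class HashFile:
    """Toy external hash file with M buckets and per-bucket overflow chains."""

    def __init__(self, num_buckets, bucket_capacity):
        self.m = num_buckets
        self.capacity = bucket_capacity                    # records per bucket
        self.buckets = [[] for _ in range(num_buckets)]
        self.overflow = [[] for _ in range(num_buckets)]   # linked overflow records

    def h(self, key):
        return key % self.m        # assumed hash function for integer keys

    def insert(self, key, record):
        i = self.h(key)
        if len(self.buckets[i]) < self.capacity:
            self.buckets[i].append((key, record))
        else:
            # Collision with a full bucket: the record goes to the overflow chain.
            self.overflow[i].append((key, record))

    def search(self, key):
        i = self.h(key)
        for k, record in self.buckets[i] + self.overflow[i]:
            if k == key:
                return record
        return None

hf = HashFile(num_buckets=4, bucket_capacity=2)
for key in (1, 5, 9, 2):
    hf.insert(key, f"r{key}")
print(hf.search(9), hf.overflow[1])
```

A search touches one bucket plus, at worst, that bucket's overflow chain, which is why lookups on the hash key are cheap as long as chains stay short.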

Hash Files


Indexing Structures for Files

An index is a data structure that allows the DBMS to locate particular records in a file more quickly and thereby speed up responses to user queries.

An index file consists of records (called index entries) of the form <search-key, pointer>.

Any subset of the fields of a relation can be the search key for an index on the relation.

Search key is not the same as key (a minimal set of fields that uniquely identify a record in a relation).

Index files are typically much smaller than the original file.

Types of Indexes

There are different types of indexes:

Single-level indexes: primary indexes, clustering indexes, secondary indexes

Multilevel indexes

Single Level Index

A single-level index is an auxiliary file that makes it more efficient to search for a record in the data file.

The index is usually specified on one field of the file (although it could be specified on several fields).

One form of an index is a file of entries <field value, pointer to record>, which is ordered by field value.

Types of Single Level Indexes

Primary Index

Defined on an ordered data file.

The data file is ordered on a key field.

Includes one index entry for each block in the data file; the index entry has the key field value for the first record in the block, which is called the block anchor.

A similar scheme can use the last record in a block.

A primary index is a nondense (sparse) index, since it includes one entry per disk block of the data file, holding the key of that block's anchor record, rather than an entry for every search value.
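A lookup through a primary index finds the last block anchor that does not exceed the search key, then reads that single data block. The sketch below assumes integer keys and models the data file as a list of sorted blocks; the function names are hypothetical.

```python
from bisect import bisect_right

def build_primary_index(blocks):
    """One entry per data block: (key of the block's first record, block number)."""
    return [(block[0], n) for n, block in enumerate(blocks)]

def lookup(index, blocks, key):
    """Return the number of the block holding key, or None if key is absent."""
    anchors = [anchor for anchor, _ in index]
    pos = bisect_right(anchors, key) - 1   # last anchor <= key
    if pos < 0:
        return None                        # key is smaller than every anchor
    block_no = index[pos][1]
    return block_no if key in blocks[block_no] else None

# Assumed data file ordered on its key field: 3 blocks, 8 records.
blocks = [[10, 20, 30], [40, 50, 60], [70, 80]]
index = build_primary_index(blocks)
print(index, lookup(index, blocks, 50))
```

The index holds 3 entries for 8 records, which is what makes it sparse; the binary search runs over the small index instead of the data file.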

Primary Index

An Example of Dense Index File

Types of Single Level Indexes

Clustering Index

Defined on an ordered data file.

The data file is ordered on a non-key field, unlike a primary index, which requires that the ordering field of the data file have a distinct value for each record.

Includes one index entry for each distinct value of the field; the index entry points to the first data block that contains records with that field value.

It is another example of a nondense index. Insertion and deletion are relatively straightforward with a clustering index.

Clustering Index

Clustering Index With a Separate Block Per Record Group

Types of Single Level Indexes

Secondary Index

A secondary index provides a secondary means of accessing a file for which some primary access already exists.

The secondary index may be on a field which is a candidate key and has a unique value in every record, or on a nonkey field with duplicate values.

The index is an ordered file with two fields.

The first field is of the same data type as some nonordering field of the data file that is an indexing field.

The second field is either a block pointer or a record pointer.

There can be many secondary indexes (and hence, indexing fields) for the same file.

Includes one entry for each record in the data file; hence, it is a dense index.
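A dense secondary index can be sketched as a sorted list of (field value, record pointer) pairs. The record layout, the field names, and the use of list positions as record pointers are assumptions for illustration; the indexed field here is a nonkey field with duplicate values.

```python
from bisect import bisect_left

def build_secondary_index(records, field):
    """Dense index: one (field value, record position) entry per record."""
    return sorted((rec[field], pos) for pos, rec in enumerate(records))

def find_all(index, value):
    """Return the positions of all records whose indexed field equals value."""
    i = bisect_left(index, (value,))   # first entry with this field value
    positions = []
    while i < len(index) and index[i][0] == value:
        positions.append(index[i][1])
        i += 1
    return positions

# Assumed data file, not ordered on the indexed field "dept".
records = [
    {"id": 3, "dept": "HR"},
    {"id": 1, "dept": "IT"},
    {"id": 2, "dept": "HR"},
]
dept_index = build_secondary_index(records, "dept")
print(dept_index, find_all(dept_index, "HR"))
```

Because the data file is not ordered on the indexed field, the index needs an entry for every record; the ordering lives in the index, not in the file.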

Secondary Index

Multi-Level Indexes

Because a single-level index is an ordered file, we can create a primary index to the index itself; in this case, the original index file is called the first-level index and the index to the index is called the second-level index.

We can repeat the process, creating a third, fourth, ..., top level until all entries of the top level fit in one disk block.

A multi-level index can be created for any type of first-level index (primary, secondary, clustering) as long as the first-level index consists of more than one disk block.
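The number of levels follows from repeating the "index on the index" step until one block is enough. The sketch below assumes every index block holds the same number of entries, called `fan_out` here; that name, the function name, and the example figures are illustrative assumptions.

```python
import math

def index_levels(first_level_entries, fan_out):
    """Number of index levels needed until the top level fits in one block.

    first_level_entries -- number of entries in the first-level index
    fan_out             -- index entries per disk block (assumed >= 2)
    """
    levels = 1
    blocks = math.ceil(first_level_entries / fan_out)   # first-level blocks
    while blocks > 1:
        # The next level has one entry per block of the level below it.
        blocks = math.ceil(blocks / fan_out)
        levels += 1
    return levels

# Assumed example: 30,000 first-level entries, 68 entries per index block.
print(index_levels(30000, 68))
```

In this assumed example the first level occupies 442 blocks, the second 7 blocks, and the third a single block, so a lookup reads three index blocks before reaching the data file.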

Multi-Level Index
