cs350-ch01 - data structures and algorithms

34
1 Chapter 01 - A Practical Introduction to Data Structures and Algorithm Analysis Class 01 Chaminade University of Honolulu Department of Computer Science Text by Clifford Shaffer Modified by Dr. Martins CS360-Data Structures

Upload: prafullsingh

Post on 06-Mar-2015

122 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: CS350-CH01 - Data Structures and Algorithms

1

Chapter 01 - A Practical Introduction to

Data Structures and Algorithm AnalysisClass 01

Chaminade University of Honolulu

Department of Computer Science

Text by Clifford Shaffer Modified by Dr. Martins

CS360-Data Structures

Page 2: CS350-CH01 - Data Structures and Algorithms

2

The Need for Data Structures

Data structures organize data more efficient programs.

More powerful computers more complex applications.

More complex applications demand more calculations.

Complex computing tasks are unlike our everyday experience.

Page 3: CS350-CH01 - Data Structures and Algorithms

3

What is a data structure? In a general sense, any data

representation is a data structure. Example: An integer

More typically, a data structure is meant to be an organization for a collection of data items.

Page 4: CS350-CH01 - Data Structures and Algorithms

4

Organizing Data

Any organization for a collection of records can be searched, processed in any order, or modified.

The choice of data structure and algorithm can make the difference between a program running in a few seconds or many days.

Page 5: CS350-CH01 - Data Structures and Algorithms

5

Efficiency

A solution is said to be efficient if it solves the problem within its resource constraints. Space Time

The cost of a solution is the amount of resources that the solution consumes.

Page 6: CS350-CH01 - Data Structures and Algorithms

6

Costs and Benefits A data structure requires a certain

amount of: space for each data item it stores time to perform a single basic

operation programming effort.

Page 7: CS350-CH01 - Data Structures and Algorithms

7

Example: Banking Application Operations are (typically):

Open accounts (far less often than access)

Close accounts (far less often than access)

Access account to Add money Access account to Withdraw money

Page 8: CS350-CH01 - Data Structures and Algorithms

8

Example: Banking Application Teller and ATM transactions are

expected to take little time. Opening or closing an account can

take much longer (perhaps up to an hour).

Page 9: CS350-CH01 - Data Structures and Algorithms

9

Example: Banking Application When considering the choice of

data structure to use in the database system that manages the accounts, we are looking for a data structure that: Is inefficient for deletion Highly efficient for search Moderately efficient for insertion

Page 10: CS350-CH01 - Data Structures and Algorithms

10

Example: Banking Application One data structure that meets these

requirements is the hash table (chapter 9). Records are accessible by account number

(called an exact-match query) Hash tables allow for extremely fast exact-

match search. Hash tables also support efficient insertion of

new records. Deletions can also be supported efficiently (but

too many deletions lead to some degradation in performance – requiring the hash table to be reorganized).

Page 11: CS350-CH01 - Data Structures and Algorithms

11

Example: City Database Database system for cities and towns. Users find information about a

particular place by name (exact-match query)

Users also find all places that match a particular value (or range of values), such as location or population size (called a range query).

Page 12: CS350-CH01 - Data Structures and Algorithms

12

Example: City Database The database must answer queries

quickly enough to satisfy the patience of a typical user.

For an exact-match query, a few seconds is satisfactory

For a range queries, the entire operation may be allowed to take longer, perhaps on the order of a minute.

Page 13: CS350-CH01 - Data Structures and Algorithms

13

Example: City Database The hash table is inappropriate for

implementing the city database because: It cannot perform efficient range queries

The B+ tree (section 10) supports large databases: Insertion Deletion Range queries

If the database is created once and then never changed, a simple linear index would be more appropriate.

Page 14: CS350-CH01 - Data Structures and Algorithms

14

Selecting a Data Structure

Select a data structure as follows:

1. Analyze the problem to determine the resource constraints a solution must meet.

2. Determine the basic operations that must be supported. Quantify the resource constraints for each operation.

3. Select the data structure that best meets these requirements.

Page 15: CS350-CH01 - Data Structures and Algorithms

15

Some Questions to Ask

Are all data inserted into the data structure at the beginning, or are insertions intersparsed with other operations?

Can data be deleted? Are all data processed in some well-

defined order, or is random access allowed?

Page 16: CS350-CH01 - Data Structures and Algorithms

16

Data Structure Philosophy

Each data structure has costs and benefits.

Rarely is one data structure better than another in all situations.

A data structure requires: space for each data item it stores, time to perform each basic operation, programming effort.

Page 17: CS350-CH01 - Data Structures and Algorithms

17

Data Structure Philosophy

Each problem has constraints on available space and time.

Only after a careful analysis of problem characteristics can we know the best data structure for the task.

Bank example: Start account: a few minutes Transactions: a few seconds Close account: overnight

continued

Page 18: CS350-CH01 - Data Structures and Algorithms

18

Goals of this Course

1. Reinforce the concept that costs and benefits exist for every data structure.

2. Learn the commonly used data structures. These form a programmer's basic data structure

``toolkit.'‘

3. Understand how to measure the cost of a data structure or program.

These techniques also allow you to judge the merits of new data structures that you or others might invent.

Page 19: CS350-CH01 - Data Structures and Algorithms

19

Abstract Data Types

Abstract Data Type (ADT): a definition for a data type solely in terms of a set of values and a set of operations on that data type.

Each ADT operation is defined by its inputs and outputs.

Encapsulation: Hide implementation details.

Page 20: CS350-CH01 - Data Structures and Algorithms

20

Data Structure

A data structure is the physical implementation of an ADT.

Each operation associated with the ADT is implemented by one or more subroutines in the implementation.

In a OO language such as C++, an ADT and its implementation together make up a class.

Data structure usually refers to an organization for data in main memory.

File structure: an organization for data on peripheral storage, such as a disk drive.

Page 21: CS350-CH01 - Data Structures and Algorithms

21

Labeling collections of objects

Humans deal with complexity by assigning a label to an assembly of objects. An ADT manages complexity through abstraction. Hierarchies of labels

Ex1: transistors gates CPU.

In a program, implement an ADT, then think only about the ADT, not its implementation.

Page 22: CS350-CH01 - Data Structures and Algorithms

22

Logical vs. Physical Form

Data items have both a logical and a physical form.

Logical form: definition of the data item within an ADT. Ex: Integers in mathematical sense: +, -

Physical form: implementation of the data item within a data structure. Ex: 16/32 bit integers, overflow.

Page 23: CS350-CH01 - Data Structures and Algorithms

23

Data Type

ADT:TypeOperations

Data Items: Logical Form

Data Items: Physical Form

Data Structure:Storage SpaceSubroutines

Page 24: CS350-CH01 - Data Structures and Algorithms

24

Problems, Algorithms and Programs

Programmers deal with: problems, algorithms and computer programs.

These are distinct concepts…

Page 25: CS350-CH01 - Data Structures and Algorithms

25

Problems

Problem: a task to be performed. Best thought of as inputs and matching

outputs. Problem definition should include constraints

on the resources that may be consumed by any acceptable solution.

Page 26: CS350-CH01 - Data Structures and Algorithms

26

Problems (cont)

Problems mathematical functions A function is a matching between inputs (the domain)

and outputs (the range). An input to a function may be single number, or a

collection of information. The values making up an input are called the

parameters of the function. A particular input must always result in the same

output every time the function is computed.

Page 27: CS350-CH01 - Data Structures and Algorithms

27

Algorithms and Programs

Algorithm: a method or a process followed to solve a problem. A recipe: The algorithm gives us a “recipe” for solving

the problem by performing a series of steps, where each step is completely understood and doable.

An algorithm takes the input to a problem (function) and transforms it to the output. A mapping of input to output.

A problem can be solved by many algorithms.

Page 28: CS350-CH01 - Data Structures and Algorithms

28

A problem can have many algorithms

For example, the problem of sorting can be solved by the following algorithms:

Insertion sort Bubble sort Selection sort Shellsort Mergesort Others

Page 29: CS350-CH01 - Data Structures and Algorithms

29

Algorithm Properties

An algorithm possesses the following properties: It must be correct. It must be composed of a series of concrete steps. There can be no ambiguity as to which step will be performed

next. It must be composed of a finite number of steps. It must terminate.

A computer program is an instance, or concrete representation, for an algorithm in some programming language.

Page 30: CS350-CH01 - Data Structures and Algorithms

30

Programs A computer program is a concrete

representation of an algorithm in some programming language.

Naturally, there are many programs that are instances of the same algorithms, since any modern programming language can be used to implement any algorithm.

Page 31: CS350-CH01 - Data Structures and Algorithms

31

To Summarize: A problem is a function or a

mapping of inputs to outputs. An algorithm is a recipe for solving

a problem whose steps are concrete and ambiguous.

A program is an instantiation of an algorithm in a computer programming language.

Page 32: CS350-CH01 - Data Structures and Algorithms

32

Example Problem: find y = x to the power of

2 Algorithm1: Multiply X by X Algorithm2: Add X to itself X times Program1: for (int i = 0; i<x; i++) y +=x;

Page 33: CS350-CH01 - Data Structures and Algorithms

33

Example (cont.)

Program2: (Assembly Intel 8086) mov bl,x // read x mov al,bl // store x mov cl,bl // int counterloop: add al,bl // al = al + x dec cl // decrement loop ctr jnz loop

Page 34: CS350-CH01 - Data Structures and Algorithms

34

In class exercises

Think of a program you have used that is unacceptably slow. Identify other basic operations that the program performs quickly enough.

Imagine that you are a shipping clerk for a large company. You have just been handed about 1000 invoices, each of which is a single sheet of paper with a large number in the upper right corner. The invoices must be sorted by this number, in order from lowest to highest. Write down as many different approaches to sorting the invoices as you can think of.