Lecture # 30 Data Organization and Binary Search

Upload: dorthy-newman

Post on 14-Jan-2016

Data Organization

Problem

• Huge amounts of information

• How do I find
  – Information that I know I want
  – Information related to what I want

• How do I understand
  – Particular pieces of information
  – The whole collection of information

Limitations

• Screen space

• Network bandwidth
  – Bandwidth - how much information can be transmitted per second

• Human attention

Kinds of things to organize

• Menu items
  – MS Word - about 150 menu items

• Text
  – Pages in a book - 500
  – Documents on the WWW - gazillions

• Images
  – All of the pictures created in a commercial advertising company

Kinds of things to organize

• Sounds
  – Sound tracks to all TV and Radio news broadcasts

• Video
  – A complete collection of classic movies

• Structured information (records)
  – People
  – Cars
  – Students
  – Electronic appliance parts

A question of scale

• 10 things

• 100 things - menu

• 1,000 things - files on your computer

• 10,000 things - students at a university

• 1,000,000 things - books in a library

• gazillion things - WWW pages

Three ways to find things

• Lists – arrays

• Trees – organize into categories

• Search – describe what you want and have the computer find it

The Phone Book Challenge

• How long will it take to find “Bill Lund” in the BYU Directory?

• How long will it take to find “422-8766” in the BYU Directory?

What algorithm did you use to search the phone book?

• Where did you start?

• How many steps did it take?

• Is there a more efficient way?

Binary search - for “Goodrich”

• Step 1: Lower = 0, Upper = 10, Guess = (0+10)/2 = 5

• Step 2: Lower = 0, Upper = 5, Guess = (0+5)/2 = 2 (rounded down)

• Step 3: Lower = 2, Upper = 5, Guess = (2+5)/2 = 3 (rounded down)

• Step 4: Lower = 3, Upper = 5, Guess = (3+5)/2 = 4
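The Lower / Upper / Guess steps can be sketched in Python. This is a minimal sketch, not the lecture's own code: the list of names is hypothetical, and it uses the common update Lower = Guess + 1 when the target is above the guess, so the bounds can differ slightly from the ones on the slides.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    lower, upper = 0, len(items)      # search the half-open range [lower, upper)
    while lower < upper:
        guess = (lower + upper) // 2  # integer midpoint, rounded down
        if items[guess] == target:
            return guess
        if items[guess] < target:
            lower = guess + 1         # target is above the guess
        else:
            upper = guess             # target is below the guess
    return -1


# Hypothetical directory of ten last names, already sorted (assumed data).
names = ["Anderson", "Bilinski", "Clark", "Garcia", "Goodrich",
         "Hansen", "Jensen", "Lund", "Olsen", "Young"]
print(binary_search(names, "Goodrich"))
```

With this list the guesses are 5, 2, and then 4, where “Goodrich” is found.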

Binary search

• If there are 64 things in a list, how many times can you divide that list in half?
  – 32, 16, 8, 4, 2, 1

• 6 times

Binary search

• If there are 1024 things in a list, how many times can you divide that list in half?
  – 512, 256, 128, 64, 32, 16, 8, 4, 2, 1

• 10 times

Binary search

• If the size of the list doubles, how many more steps are required in a binary search?
  – 1

Binary search

• If there are N items in a list then binary search takes about log2(N) steps
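The halving argument can be checked with a few lines of Python. This is a small sketch; the function name count_halvings is made up for illustration.

```python
def count_halvings(n):
    """Count how many times n can be halved (integer division) before reaching 1."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps


for size in (64, 1024, 2048):
    print(size, count_halvings(size))
```

For 64, 1024, and 2048 this gives 6, 10, and 11: doubling the list from 1024 to 2048 adds one step.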

Binary search

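• Estimating log2(N)
  – Count the number of digits and multiply by 2.5
  – This is a rough estimate; it runs low for large N

• 1000
  – 4*2.5 = 10 steps (log2 is about 10)

• 1,000,000
  – 7*2.5 = 17-18 steps (log2 is about 20)

• 1,000,000,000
  – 10*2.5 = 25 steps (log2 is about 30)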
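To see how close the digit-counting rule is, it can be compared with the exact logarithm. A minimal sketch; estimate_steps is a hypothetical helper name, and math.log2 is the standard-library base-2 logarithm.

```python
import math


def estimate_steps(n):
    """Rule of thumb from the slide: number of decimal digits times 2.5."""
    return len(str(n)) * 2.5


for n in (1000, 1_000_000, 1_000_000_000):
    # Print the list size, the rule-of-thumb estimate, and the exact log2.
    print(n, estimate_steps(n), round(math.log2(n), 1))
```

The estimate matches well for 1000 and falls a few steps short for the larger sizes, which is fine for a quick mental check.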

Provo/Orem phone book

• How long to find “Bill Lund”?
  – About 5000 entries in the BYU Directory
  – log2(5000) is approximately 4*2.5 = 10 steps

How to find a phone number

• 920-3231
  – 1 step

• 130-2313
  – 11 steps

• Average?
  – 5 steps

• Average N?
  – N/2
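Looking up a number in a list that is not sorted by number means checking entries one at a time. A minimal sketch of that linear search follows. The list is hypothetical: only the first and last numbers come from the slide, and the 555 numbers are placeholders. The exact average for N entries is (N+1)/2, which is roughly N/2.

```python
def linear_search_steps(items, target):
    """Return how many entries are examined before target is found, or None."""
    for steps, item in enumerate(items, start=1):
        if item == target:
            return steps
    return None


# Hypothetical list of 11 numbers in name order (not sorted by number).
numbers = ["920-3231", "232-0312", "823-1242", "123-3123", "238-1234",
           "555-0101", "555-0102", "555-0103", "555-0104", "555-0105",
           "130-2313"]

print(linear_search_steps(numbers, "920-3231"))  # first entry
print(linear_search_steps(numbers, "130-2313"))  # last entry

# Average number of steps over every number in the list.
average = sum(linear_search_steps(numbers, n) for n in numbers) / len(numbers)
print(average)
```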

Provo/Orem phone book

• How many steps to find a phone number?
  – 5,000/2 = 2,500 average

• How can we improve this?

Sort the phone book by phone number

• What if I want to search on both name and number?

Using an Index

• Last Name index: Anderson, Bilinski, Clark, Garcia

• Phone number index: 123-3123, 130-2313, 232-0312, 238-1234

Search for Goodrich (Last Name index)

• Lower = 0, Upper = 10, Guess = 5: Goodrich is below the guess

• Lower = 0, Upper = 5, Guess = 2: Goodrich is above the guess

• Lower = 2, Upper = 5, Guess = 3: Goodrich is above the guess

• Lower = 3, Upper = 5, Guess = 4: Goodrich is above the guess

Search for 823-1242 (Phone number index)

• Lower = 0, Upper = 10, Guess = 5: 823-1242 is above the guess

• Lower = 5, Upper = 10, Guess = 7: 823-1242 is below the guess

• Lower = 5, Upper = 7, Guess = 6: MATCH

Using an Index: Last Name, Phone number

• What about first name or city?
  – Another index
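One way to picture several indices over the same data is sketched below. It assumes a small hypothetical set of records (the pairing of names and numbers is invented for illustration) and the helper names build_index and lookup are made up. Each index is a sorted list of (key, record position) pairs, searched with the standard-library bisect module.

```python
from bisect import bisect_left

# Hypothetical records as (last name, phone number); the pairing is assumed.
records = [
    ("Garcia", "130-2313"),
    ("Anderson", "823-1242"),
    ("Clark", "123-3123"),
    ("Bilinski", "238-1234"),
    ("Goodrich", "232-0312"),
]


def build_index(records, field):
    """Return (key, record position) pairs sorted by the chosen field."""
    return sorted((rec[field], pos) for pos, rec in enumerate(records))


def lookup(index, records, key):
    """Binary-search the index for key and return the matching record, or None."""
    i = bisect_left(index, (key, -1))  # -1 sorts before any real position
    if i < len(index) and index[i][0] == key:
        return records[index[i][1]]
    return None


by_name = build_index(records, 0)   # index on last name
by_phone = build_index(records, 1)  # index on phone number

print(lookup(by_name, records, "Goodrich"))
print(lookup(by_phone, records, "823-1242"))
```

The records themselves are stored once and stay in their original order. Supporting a search by first name or city would mean building one more index over the same records.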

Data Organization Summary

• What are we organizing for?

• Scale
  – 10 - 1,000 - 1,000,000 - 1,000,000,000

• Lists
  – Unsorted (N/2)
  – Sorted Log2(N)
    • count the digits and multiply by 2.5

• To access in many ways
  – Use many indices into the same data