Searching COMP 103 #10


Post on 07-Jun-2015


COMP103 – 2006/T3

Menu

• Cost of ArraySet

• Simple Searching

• Binary Search

• Assignment 4 – Due Next Wednesday


Cost of ArraySet

• ArraySet uses the same data structure as ArrayList
  • does not have to keep items in order

• Operations are:
  • contains(item)
  • add(item) ← always add at the end
  • remove(item) ← no need to shift items down; just move the last item down into the gap

• What are the costs (for n items)?
  • contains:
  • remove:
  • add:


ArrayList/ArraySet costs

ArrayList (duplicates permitted):

• get, set: O(1)

• remove, add (at index i): O(n)

• add (at end): O(1) on average, O(n) in the worst case (when the array is full and must grow)

ArraySet (no duplicates):

• When adding, we need to check that the element does not exist already.

• contains, add, remove: O(n)

• All of the cost is in the searching.
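To make the costs above concrete, here is a minimal sketch of an unordered array-backed set. The class name UnsortedArraySetSketch, its helper indexOf, and the initial capacity of 4 are assumptions made for illustration; this is not the course's ArraySet. Every operation starts with the same O(n) linear search, and everything after the search is O(1).

```java
import java.util.Arrays;

// Hypothetical sketch of an unordered array-backed set (not the course's ArraySet).
class UnsortedArraySetSketch<E> {
    private Object[] data = new Object[4]; // items live in data[0 .. count-1]
    private int count = 0;

    // Linear search, O(n): returns the index of item, or -1 if it is absent.
    private int indexOf(Object item) {
        for (int i = 0; i < count; i++)
            if (data[i].equals(item)) return i;
        return -1;
    }

    public boolean contains(Object item) {
        return indexOf(item) >= 0;
    }

    // O(n) duplicate check, then an O(1) (on average) append at the end.
    public boolean add(E item) {
        if (indexOf(item) >= 0) return false;      // already present
        if (count == data.length)
            data = Arrays.copyOf(data, 2 * count); // grow when full
        data[count++] = item;
        return true;
    }

    // O(n) search, then O(1) removal: move the last item down into the gap.
    public boolean remove(Object item) {
        int i = indexOf(item);
        if (i < 0) return false;
        data[i] = data[--count];
        data[count] = null;
        return true;
    }

    public int size() {
        return count;
    }

    public static void main(String[] args) {
        UnsortedArraySetSketch<String> set = new UnsortedArraySetSketch<>();
        for (String s : new String[] {"Bee", "Dog", "Ant", "Fox", "Gnu", "Eel", "Cat"})
            set.add(s);
        System.out.println(set.add("Dog")); // false: duplicate
        System.out.println(set.add("Pig")); // true
        System.out.println(set.size());     // 8
    }
}
```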


Duplicate Detection

Add ‘Dog’ (already present, so nothing changes):

Bee Dog Ant Fox Gnu Eel Cat        (count = 7)

Add ‘Pig’ (not present, so it is added at the end):

Bee Dog Ant Fox Gnu Eel Cat Pig    (count = 8)

Question:

• How can we speed up the search?


How else can we search?

• Phone Book
  • Data is ordered, so it is
  • Faster to find items, because…
  • We know roughly how far to look

Add ‘Cat’:

Ant Bee Cat Dog Eel Fox Gnu Pig    (count = 8)


An improvement, but

• This is still an O(n) search.
• Think about how you search a phone book…
• Exercise: Write down where your eyes scan when looking for G and W in the list:

AAABBBCCCDDDDEEEEFFFGGGHHHIIIJJJJJJJJJKKKKKKLLLLLMMMMMNNOOOPPQQRRRSSSTTTTTUUVVVVVWXXXYZZZ


A better strategy…

• We can model a search pattern on the human search pattern.

• Idea:
  • Look at the middle.
  • Is the item before or after it?
  • If before, look at the middle of the first half.
  • If after, look at the middle of the second half.

• This is called binary search (because we are splitting the list into two parts).


Making ArraySet faster.

• Binary Search: Finding “Eel”

• If the items are sorted (“ordered”), then we can search fast

• Look in the middle:
  • if item is middle item ⇒ return
  • if item is before middle item ⇒ look in left half
  • if item is after middle item ⇒ look in right half

index:  0   1   2   3   4   5   6   7   8
item:   Ant Bee Cat Dog Eel Fox Gnu Pig
count = 8

mid values: (0+7)/2 = 3, (4+7)/2 = 5, (4+4)/2 = 4


Binary Search

public boolean contains(E item){
    Comparable<E> value = (Comparable<E>) item;

    int low = 0;           // min possible index of item
    int high = count - 1;  // max possible index of item

    // item in [low .. high] (if present)
    while (low <= high){
        int mid = (low + high) / 2;
        int comp = value.compareTo(data[mid]);
        if (comp == 0)       // item is present
            return true;
        if (comp < 0)        // item in [low .. mid-1]
            high = mid - 1;  // item in [low .. high]
        else                 // item in [mid+1 .. high]
            low = mid + 1;   // item in [low .. high]
    }
    // item in [low .. high] and low > high, therefore item not present
    return false;
}
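To watch the search narrow in on “Eel”, the loop above can be instrumented to record every mid index it probes. The sketch below is a standalone assumption for illustration: the class BinarySearchTrace and the method probes are hypothetical names, and it works on a plain String array instead of the set's data field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tracing helper: same loop as contains(), but it records each mid.
class BinarySearchTrace {
    // Returns the mid indices probed while searching sorted[0 .. count-1] for item.
    static List<Integer> probes(String[] sorted, int count, String item) {
        List<Integer> mids = new ArrayList<>();
        int low = 0;
        int high = count - 1;
        while (low <= high) {
            int mid = (low + high) / 2;
            mids.add(mid);
            int comp = item.compareTo(sorted[mid]);
            if (comp == 0) break;          // found: stop probing
            if (comp < 0) high = mid - 1;  // item can only be in the left half
            else low = mid + 1;            // item can only be in the right half
        }
        return mids;
    }

    public static void main(String[] args) {
        String[] data = {"Ant", "Bee", "Cat", "Dog", "Eel", "Fox", "Gnu", "Pig"};
        System.out.println(probes(data, 8, "Eel")); // [3, 5, 4]
        System.out.println(probes(data, 8, "Cow")); // [3, 1, 2] (not present)
    }
}
```

Only three of the eight items are inspected in either search, whether or not the item is present.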


An Alternative

Return the index of where the item ought to be, whether present or not.

public int findIndex(Object item){
    Comparable<E> value = (Comparable<E>) item;

    int low = 0;
    int high = count;
    // index in [low .. high]
    while (low < high){
        int mid = (low + high) / 2;
        if (value.compareTo(data[mid]) > 0) // index in [mid+1 .. high]
            low = mid + 1;                  // index in [low .. high], low <= high
        else                                // index in [low .. mid]
            high = mid;                     // index in [low .. high], low <= high
    }
    return low;  // index in [low .. high] and low = high, therefore index = low
}
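One way to use this version: since findIndex reports where the item belongs, a membership test only has to check whether that slot actually holds the item. The sketch below assumes a plain String array and hypothetical names (FindIndexSketch and its static methods) so that it runs on its own; it is not the course's class.

```java
// Hypothetical standalone version of findIndex on a sorted String array.
class FindIndexSketch {
    // Index of the first element >= item in data[0 .. count-1], or count if none.
    static int findIndex(String[] data, int count, String item) {
        int low = 0;
        int high = count;
        while (low < high) {
            int mid = (low + high) / 2;
            if (item.compareTo(data[mid]) > 0)
                low = mid + 1;  // item belongs strictly after mid
            else
                high = mid;     // item belongs at mid or before it
        }
        return low;
    }

    // The item is present only if the slot where it belongs already holds it.
    static boolean contains(String[] data, int count, String item) {
        int i = findIndex(data, count, item);
        return i < count && data[i].equals(item);
    }

    public static void main(String[] args) {
        String[] data = {"Ant", "Bee", "Cat", "Dog", "Eel", "Fox", "Gnu", "Pig"};
        System.out.println(findIndex(data, 8, "Eel")); // 4 (present)
        System.out.println(findIndex(data, 8, "Cow")); // 3 (absent: belongs before Dog)
        System.out.println(findIndex(data, 8, "Yak")); // 8 (absent: belongs at the end)
        System.out.println(contains(data, 8, "Cow"));  // false
    }
}
```

Note the i < count check: when the item belongs at the very end, the returned index is count, which is not a valid position to read.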


Exercise

• Calculate all the mid and return values for the following searches using findIndex(…):

Find ‘Cat’ (count = 9):

index:  0   1   2   3   4   5   6   7   8
item:   Ant Bee Cat Dog Eel Fox Gnu Pig Zbu

Find ‘Bug’ (count = 7):

index:  0   1   2   3   4   5   6   7   8
item:   Ant Bee Cat Dog Eel Fox Gnu

Find ‘Orc’ (count = 8):

index:  0   1   2   3   4   5   6   7   8
item:   Ant Bee Cat Dog Eel Fox Gnu Orc


Another Exercise

• Write a recursive binary search below!

public int findIndex(E item, int low, int high){
    Comparable<E> value = (Comparable<E>) item;
    if (low >= high)
        return low;
    int mid = (low + high) / 2;
    if (value.compareTo(data[mid]) > 0)
        return findIndex(item, mid + 1, high);
    else
        return findIndex(item, low, mid);
}
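The recursive version needs a starting range, and the natural first call covers the whole array: low = 0 and high = count. A minimal runnable sketch, assuming a String array and the hypothetical class name RecursiveFindSketch:

```java
// Hypothetical standalone recursive findIndex on a sorted String array.
class RecursiveFindSketch {
    // Searches data[low .. high-1]; returns the index where item belongs.
    static int findIndex(String[] data, String item, int low, int high) {
        if (low >= high)
            return low;  // range is empty: this is where item belongs
        int mid = (low + high) / 2;
        if (item.compareTo(data[mid]) > 0)
            return findIndex(data, item, mid + 1, high); // search right half
        else
            return findIndex(data, item, low, mid);      // search left half, keeping mid
    }

    public static void main(String[] args) {
        String[] data = {"Ant", "Bee", "Cat", "Dog", "Eel", "Fox", "Gnu", "Pig"};
        int count = data.length;
        // The initial call covers the whole array: low = 0, high = count.
        System.out.println(findIndex(data, "Fox", 0, count)); // 5
        System.out.println(findIndex(data, "Cow", 0, count)); // 3
    }
}
```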


Binary Search: Cost

• What is the cost of searching if there are n items in the set?
  • key step = ? (look at the code)

Iteration    Size of range
1            n
2            n/2
3            n/4
…            …
k            1

[Figure: an array of 32 cells with indices 0 to 31]
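Following the pattern n, n/2, n/4, …, the range at iteration i holds about n/2^(i-1) items. As a worked step (assuming, for simplicity, that n is a power of 2 so the halving is exact), the search reaches a range of size 1 at iteration k when:

```latex
\[
\frac{n}{2^{\,k-1}} = 1
\quad\Longrightarrow\quad
2^{\,k-1} = n
\quad\Longrightarrow\quad
k = \log_2 n + 1
\]
```

So the number of iterations grows like log2(n), which is why the cost of binary search is O(log(n)).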


Log2(n):

The number of times you can divide a set of n things in half.

lg(1000) ≈ 10, lg(1,000,000) ≈ 20, lg(1,000,000,000) ≈ 30

Every time you double n, you add one step to the cost!

• Arises all over the place in analysing algorithms, especially “Divide and Conquer” algorithms (it is easier to solve small problems):

[Figure: divide and conquer. A Problem is split into two parts, each part is solved (Solve, Solve), giving the Solution.]


ArraySet with Binary Search

ArraySet: unordered

• All cost in the searching: O(n)
  • contains: O(n)
  • add: O(n)
  • remove: O(n)

ArraySet: with Binary Search

• Binary Search is fast: O(log(n))
  • contains: O(log(n))
  • add: O(log(n)) to search, but O(n) overall
  • remove: O(log(n)) to search, but O(n) overall

• All the cost is in keeping it sorted!
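To see where the O(n) comes from, here is a minimal sketch of add and remove on a sorted array. The class name SortedAddSketch, the String element type, and the initial capacity are assumptions for illustration, not the course's SortedArraySet. The search takes O(log(n)), but the shift that keeps the array in order can touch every item.

```java
import java.util.Arrays;

// Hypothetical sketch of a sorted array-backed set of Strings.
class SortedAddSketch {
    private String[] data = new String[4]; // sorted items in data[0 .. count-1]
    private int count = 0;

    // Binary search, O(log n): index where item is, or where it belongs.
    private int findIndex(String item) {
        int low = 0;
        int high = count;
        while (low < high) {
            int mid = (low + high) / 2;
            if (item.compareTo(data[mid]) > 0) low = mid + 1;
            else high = mid;
        }
        return low;
    }

    public boolean add(String item) {
        int i = findIndex(item);                               // O(log n)
        if (i < count && data[i].equals(item)) return false;   // duplicate
        if (count == data.length)
            data = Arrays.copyOf(data, 2 * count);             // grow when full
        System.arraycopy(data, i, data, i + 1, count - i);     // O(n): shift up
        data[i] = item;
        count++;
        return true;
    }

    public boolean remove(String item) {
        int i = findIndex(item);                               // O(log n)
        if (i >= count || !data[i].equals(item)) return false; // not present
        System.arraycopy(data, i + 1, data, i, count - i - 1); // O(n): shift down
        data[--count] = null;
        return true;
    }

    public String[] toArray() {
        return Arrays.copyOf(data, count);
    }

    public static void main(String[] args) {
        SortedAddSketch set = new SortedAddSketch();
        for (String s : new String[] {"Dog", "Ant", "Pig", "Cat", "Bee", "Dog"})
            set.add(s);
        System.out.println(Arrays.toString(set.toArray())); // [Ant, Bee, Cat, Dog, Pig]
    }
}
```

Unlike the unordered set, remove cannot simply move the last item into the gap, because that would break the ordering that binary search relies on.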


Making SortedArraySet fast

• If you have to add() and/or remove() many items, then SortedArraySet is no better than ArraySet (we’ll see how to do better later in the course)
  • Both O(n)
  • Either pay to search
  • Or pay to keep it in order

• If you only have to construct the set once, and then make many calls to contains(), then SortedArraySet is much better than ArraySet.
  • SortedArraySet contains() is O(log(n))

• But, how do you construct the set fast?
  • A separate constructor.


Alternative Constructor

public SortedArraySet(Collection<E> c){
    // Make space
    count = c.size();
    data = (E[]) new Object[count];

    // Put collection into a list and sort
    List<E> temp = new ArrayList<E>(c);
    Collections.sort(temp, new ComparableComparator());

    // Put sorted list into the data array
    int i = 0;
    for (E item : temp)
        data[i++] = item;
}

• How do you actually sort? Topic of the next lecture…
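The point of the separate constructor is cost: calling add() n times shifts items on every call, O(n²) in total, whereas sorting once is O(n log(n)). The constructor above copies every element of c, so it relies on c having no duplicates. The sketch below is one possible variant, assuming String items with their natural ordering and the hypothetical name BulkBuildSketch; it skips duplicates, which sit next to each other once the list is sorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical bulk-construction helper for a sorted, duplicate-free array.
class BulkBuildSketch {
    // Sorts once (O(n log n)), then copies while skipping adjacent duplicates (O(n)).
    static String[] buildSorted(Collection<String> c) {
        List<String> temp = new ArrayList<>(c);
        Collections.sort(temp);                  // natural ordering of String
        String[] data = new String[temp.size()];
        int count = 0;
        for (String item : temp)
            if (count == 0 || !data[count - 1].equals(item))
                data[count++] = item;            // keep only the first of equal items
        return Arrays.copyOf(data, count);       // trim unused slots
    }

    public static void main(String[] args) {
        String[] built = buildSorted(Arrays.asList("Dog", "Ant", "Dog", "Cat"));
        System.out.println(Arrays.toString(built)); // [Ant, Cat, Dog]
    }
}
```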