Searching
COMP 103 #10
COMP103 – 2006/T3 2
Menu
• Cost of ArraySet
• Simple Searching
• Binary Search
• Assignment 4 – Due Next Wednesday
Cost of ArraySet
• ArraySet uses the same data structure as ArrayList
  • but does not have to keep the items in order
• Operations are:
  • contains(item)
  • add(item) ← always add at the end
  • remove(item) ← no need to shift items down; just move the last item into the gap
• What are the costs?
• contains:
• remove:
• add:
ArrayList/ArraySet costs
ArrayList (duplicates permitted):
• get, set: O(1)
• remove, add (at i): O(n)
• add (at end): O(1) (average), O(n) (worst)
ArraySet (no duplicates):
• When adding, we need to check the element does not exist already.
• contains, add, remove: O(n)
• All of the additional cost is in the searching.
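The costs above can be seen in a small sketch of an unordered array set. This is only an illustration, not the course's ArraySet: the class name UnorderedArraySetSketch, the String element type and the starting capacity of 4 are all assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical minimal unordered set of Strings (illustrative names, not the course class).
class UnorderedArraySetSketch {
    private String[] data = new String[4];
    private int count = 0;

    // Linear search: looks at up to count items, so O(n).
    private int indexOf(String item) {
        for (int i = 0; i < count; i++) {
            if (data[i].equals(item)) return i;
        }
        return -1;
    }

    public boolean contains(String item) {
        return indexOf(item) >= 0;
    }

    // The duplicate check is the O(n) part; the append itself is O(1) on average.
    public boolean add(String item) {
        if (contains(item)) return false;
        if (count == data.length) data = Arrays.copyOf(data, 2 * data.length);
        data[count++] = item;
        return true;
    }

    // Finding the item is the O(n) part; moving the last item into the gap is O(1).
    public boolean remove(String item) {
        int i = indexOf(item);
        if (i < 0) return false;
        data[i] = data[count - 1];
        data[--count] = null;
        return true;
    }

    public int size() {
        return count;
    }

    public static void main(String[] args) {
        UnorderedArraySetSketch set = new UnorderedArraySetSketch();
        for (String s : new String[] {"Bee", "Dog", "Ant", "Fox", "Gnu", "Eel", "Cat"}) {
            set.add(s);
        }
        System.out.println(set.add("Dog")); // duplicate, so not added
        System.out.println(set.add("Pig")); // new item, appended at the end
        System.out.println(set.size());
    }
}
```

In every operation the only loop is the linear search, which is why all three come out as O(n).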
Duplicate Detection
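[Figure: the unordered array Bee Dog Ant Fox Gnu Eel Cat (count = 7). Add ‘Dog’: the scan finds Dog, so the array is unchanged (count = 7). Add ‘Pig’: the scan reaches the end without a match, so Pig is appended (count = 8).]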
Question:
• How can we speed up the search?
How else can we search?
• Phone book
  • Data is ordered, so it is
  • faster to find items, because
  • we know roughly how far to look
[Figure: the ordered array Ant Bee Cat Dog Eel Fox Gnu Pig (count = 8), with the operation Add ‘Cat’.]
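One way to use the ordering is a left-to-right scan that stops as soon as it passes the place where the item would be. The sketch below assumes a sorted String array; the class and method names are made up for the example.

```java
// Hypothetical helper: linear search that exploits ordering by stopping early.
class OrderedLinearSearchSketch {
    static boolean contains(String[] sorted, String target) {
        for (String s : sorted) {
            int c = s.compareTo(target);
            if (c == 0) return true;   // found it
            if (c > 0) return false;   // passed the place where target would be
        }
        return false;                  // target is larger than every item
    }

    public static void main(String[] args) {
        String[] data = {"Ant", "Bee", "Cat", "Dog", "Eel", "Fox", "Gnu", "Pig"};
        System.out.println(contains(data, "Cat")); // present
        System.out.println(contains(data, "Bug")); // absent: the scan stops at Cat
    }
}
```

An absent item is detected without scanning to the end, but the loop can still visit every item.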
An improvement, but
• This is still an O(n) search.
• Think about how you search a phone book…
• Exercise: Write down where your eyes scan when looking for G and W in the list:
AAABBBCCCDDDDEEEEFFFGGGHHHIIIJJJJJJJJJKKKKKKLLLLLMMMMMNNOOOPPQQRRRSSSTTTTTUUVVVVVWXXXYZZZ
A better strategy…
• We can model a search pattern on the human search pattern.
• Idea:
  • Look at the middle.
  • Is the item before or after it?
  • If before, look at the middle of the first half.
  • If after, look at the middle of the second half.
• This is called binary search (because we are splitting the list into two parts).
Making ArraySet faster.
• Binary Search: Finding “Eel”
• If the items are sorted (“ordered”), then we can search fast
• Look in the middle:
  • if item is the middle item ⇒ return
  • if item is before the middle item ⇒ look in the left half
  • if item is after the middle item ⇒ look in the right half
[Figure: indices 0 to 7 hold Ant Bee Cat Dog Eel Fox Gnu Pig (count = 8). Midpoints probed when finding “Eel”: (0+7)/2 = 3, (4+7)/2 = 5, (4+4)/2 = 4.]
Binary Search
public boolean contains(E item){
    Comparable<E> value = (Comparable<E>) item;
    int low = 0;           // min possible index of item
    int high = count - 1;  // max possible index of item
    // item in [low .. high] (if present)
    while (low <= high){
        int mid = (low + high) / 2;
        if (value.compareTo(data[mid]) == 0)  // item is present
            return true;
        if (value.compareTo(data[mid]) < 0)   // item in [low .. mid-1]
            high = mid - 1;                   // item in [low .. high]
        else                                  // item in [mid+1 .. high]
            low = mid + 1;                    // item in [low .. high]
    }
    return false;  // item in [low .. high] and low > high,
                   // therefore item not present
}
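To watch the search narrow down, here is a self-contained sketch of the same loop that records every mid index it probes. It assumes a plain sorted String array rather than the set's data field, and the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tracing version of the binary search loop.
class BinarySearchTraceSketch {
    // Returns the sequence of mid indices probed while searching for target.
    static List<Integer> probes(String[] sorted, String target) {
        List<Integer> mids = new ArrayList<>();
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = (low + high) / 2;
            mids.add(mid);
            int c = target.compareTo(sorted[mid]);
            if (c == 0) break;          // found: stop probing
            if (c < 0) high = mid - 1;  // target is in the left half
            else low = mid + 1;         // target is in the right half
        }
        return mids;
    }

    public static void main(String[] args) {
        String[] data = {"Ant", "Bee", "Cat", "Dog", "Eel", "Fox", "Gnu", "Pig"};
        System.out.println(probes(data, "Eel")); // prints [3, 5, 4]
    }
}
```

The probes for “Eel” match the midpoints on the previous slide: Dog at 3, Fox at 5, then Eel at 4.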
An Alternative
Return the index of where the item ought to be, whether present or not.
public int findIndex(Object item){
    Comparable<E> value = (Comparable<E>) item;
    int low = 0;
    int high = count;
    // index in [low .. high]
    while (low < high){
        int mid = (low + high) / 2;
        if (value.compareTo(data[mid]) > 0)  // index in [mid+1 .. high]
            low = mid + 1;                   // index in [low .. high], low <= high
        else                                 // index in [low .. mid]
            high = mid;                      // index in [low .. high], low <= high
    }
    return low;  // index in [low .. high] and low = high,
                 // therefore index = low
}
Exercise
• Calculate all the mid and return values from the following searches using findIndex(…):
• Find ‘Cat’: indices 0 to 8 hold Ant Bee Cat Dog Eel Fox Gnu Pig Zbu (count = 9)
• Find ‘Bug’: indices 0 to 6 hold Ant Bee Cat Dog Eel Fox Gnu (count = 7)
• Find ‘Orc’: indices 0 to 7 hold Ant Bee Cat Dog Eel Fox Gnu Orc (count = 8)
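After working the exercise by hand, a small checker can confirm the answers. The sketch below repeats the findIndex loop over a plain String array and records each mid. It assumes the three arrays are as listed in the code (including Zbu as the ninth item of the first array), and the class and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical checker for the findIndex exercise.
class FindIndexCheckSketch {
    // Same loop as findIndex, but over a String[] and recording each mid in mids.
    static int findIndex(String[] data, String item, List<Integer> mids) {
        int low = 0;
        int high = data.length;
        while (low < high) {
            int mid = (low + high) / 2;
            mids.add(mid);
            if (item.compareTo(data[mid]) > 0) low = mid + 1;
            else high = mid;
        }
        return low;
    }

    static void check(String[] data, String item) {
        List<Integer> mids = new ArrayList<>();
        int index = findIndex(data, item, mids);
        System.out.println("Find " + item + ": mids = " + mids + ", returns " + index);
    }

    public static void main(String[] args) {
        // Array contents are an assumption reconstructed from the exercise.
        check(new String[] {"Ant", "Bee", "Cat", "Dog", "Eel", "Fox", "Gnu", "Pig", "Zbu"}, "Cat");
        check(new String[] {"Ant", "Bee", "Cat", "Dog", "Eel", "Fox", "Gnu"}, "Bug");
        check(new String[] {"Ant", "Bee", "Cat", "Dog", "Eel", "Fox", "Gnu", "Orc"}, "Orc");
    }
}
```

Note that the search for ‘Bug’ returns a valid index even though Bug is absent: it is the position where Bug would have to be inserted.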
Another Exercise
• Write a recursive binary search below!
public int findIndex(E item, int low, int high){
    Comparable<E> value = (Comparable<E>) item;
    if (low >= high)
        return low;
    int mid = (low + high) / 2;
    if (value.compareTo(data[mid]) > 0)
        return findIndex(item, mid + 1, high);
    else
        return findIndex(item, low, mid);
}
Binary Search: Cost
• What is the cost of searching if there are n items in the set?
  • key step = ? (look at the code)
• Iteration    Size of range
  1            n
  2            n/2
  3            n/4
  …            …
  k            1
[Figure: an array of 32 cells, indexed 0 to 31.]
Log2(n):
The number of times you can divide a set of n things in half.
lg(1000) ≈ 10, lg(1,000,000) ≈ 20, lg(1,000,000,000) ≈ 30
Every time you double n, you add one step to the cost!
• Arises all over the place in analysing algorithms, especially “Divide and Conquer” algorithms (it is easier to solve small problems).
[Figure: divide-and-conquer diagram: a Problem is split into two parts, each labelled Solve, leading to a Solution.]
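The "number of halvings" definition can be checked directly. The sketch below counts how many times n can be halved, rounding up, until one item is left; rounding up is a choice made for this example so that the counts line up with the rounded lg values, and the class and method names are hypothetical.

```java
// Hypothetical helper: counts halvings to illustrate log2(n).
class HalvingCountSketch {
    // Number of times n can be halved (rounding up) before only one item is left.
    static int halvings(long n) {
        int steps = 0;
        while (n > 1) {
            n = (n + 1) / 2;
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(halvings(1000L));          // prints 10
        System.out.println(halvings(1_000_000L));     // prints 20
        System.out.println(halvings(1_000_000_000L)); // prints 30
        System.out.println(halvings(2000L));          // prints 11: doubling n adds one step
    }
}
```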
ArraySet with Binary Search
ArraySet: unordered
• All the cost is in the searching: O(n)
  • contains: O(n)
  • add: O(n)
  • remove: O(n)
ArraySet: with Binary Search
• Binary Search is fast: O(log(n))
  • contains: O(log(n))
  • add: O(log(n)) to find the position, but O(n) overall
  • remove: O(log(n)) to find the item, but O(n) overall
• All the cost is in keeping it sorted!
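A sketch of a sorted array set shows where the O(n) comes from: the search is fast, but add and remove must shift items to keep the array in order. This is an illustration under assumptions, not the course's SortedArraySet: it stores Strings, starts with capacity 4, and uses invented names.

```java
import java.util.Arrays;

// Hypothetical minimal sorted set of Strings (illustrative, not the course class).
class SortedArraySetSketch {
    private String[] data = new String[4];
    private int count = 0;

    // Binary search for the index where item is, or ought to be: O(log n).
    private int findIndex(String item) {
        int low = 0;
        int high = count;
        while (low < high) {
            int mid = (low + high) / 2;
            if (item.compareTo(data[mid]) > 0) low = mid + 1;
            else high = mid;
        }
        return low;
    }

    public boolean contains(String item) {
        int i = findIndex(item);
        return i < count && data[i].equals(item);
    }

    public boolean add(String item) {
        int i = findIndex(item);
        if (i < count && data[i].equals(item)) return false; // already present
        if (count == data.length) data = Arrays.copyOf(data, 2 * data.length);
        System.arraycopy(data, i, data, i + 1, count - i);   // shift up: O(n)
        data[i] = item;
        count++;
        return true;
    }

    public boolean remove(String item) {
        int i = findIndex(item);
        if (i >= count || !data[i].equals(item)) return false; // not present
        System.arraycopy(data, i + 1, data, i, count - i - 1); // shift down: O(n)
        data[--count] = null;
        return true;
    }

    @Override
    public String toString() {
        return Arrays.toString(Arrays.copyOf(data, count));
    }

    public static void main(String[] args) {
        SortedArraySetSketch set = new SortedArraySetSketch();
        for (String s : new String[] {"Dog", "Ant", "Pig", "Bee", "Cat"}) set.add(s);
        System.out.println(set); // prints [Ant, Bee, Cat, Dog, Pig]
        set.remove("Cat");
        System.out.println(set); // prints [Ant, Bee, Dog, Pig]
    }
}
```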
Making SortedArraySet fast
• If you have to call add() and/or remove() many times, then SortedArraySet is no better than ArraySet (we’ll see how to do better later in the course)
  • Both are O(n)
  • Either pay to search
  • Or pay to keep it in order
• If you only have to construct the set once, and then make many calls to contains(), then SortedArraySet is much better than ArraySet.
  • SortedArraySet contains() is O(log(n))
• But how do you construct the set fast?
  • A separate constructor.
Alternative Constructor
public SortedArraySet(Collection<E> c){
    // Make space
    count = c.size();
    data = (E[]) new Object[count];
    // Put collection into a list and sort
    List<E> temp = new ArrayList<E>(c);
    Collections.sort(temp, new ComparableComparator());
    // Put sorted list into the data array
    int i = 0;
    for (E item : temp)
        data[i++] = item;
}
• How do you actually sort? Topic of the next Lecture…
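The "construct once, search many times" pattern can be tried with the standard library alone. The sketch below is an assumption-laden stand-in for the constructor above: it sorts Strings by their natural ordering instead of using ComparableComparator, it also drops duplicates (which the constructor above does not do), and its class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical build-once, search-many helper.
class BuildOnceSketch {
    // Sorts a copy of the collection once, then drops adjacent duplicates.
    static String[] buildSorted(Collection<String> c) {
        List<String> temp = new ArrayList<>(c);
        Collections.sort(temp); // natural ordering of String
        List<String> unique = new ArrayList<>();
        for (String s : temp) {
            if (unique.isEmpty() || !unique.get(unique.size() - 1).equals(s)) {
                unique.add(s);
            }
        }
        return unique.toArray(new String[0]);
    }

    // Each lookup is a binary search over the sorted array.
    static boolean contains(String[] sorted, String item) {
        return Arrays.binarySearch(sorted, item) >= 0;
    }

    public static void main(String[] args) {
        String[] set = buildSorted(Arrays.asList("Bee", "Dog", "Ant", "Fox", "Dog", "Cat"));
        System.out.println(Arrays.toString(set)); // prints [Ant, Bee, Cat, Dog, Fox]
        System.out.println(contains(set, "Cat"));
        System.out.println(contains(set, "Pig"));
    }
}
```

The sort is paid for once up front, and every later contains() call costs only O(log(n)).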