ics 220 – data structures and algorithms dr. ken cosh lecture 9

42
ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Upload: neal-rice

Post on 11-Jan-2016

218 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

ICS 220 – Data Structures and Algorithms

Dr. Ken Cosh

Lecture 9

Page 2: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Review

• Graphs– Types– Representations– Traversal– Shortest Path

• Dijkstra’s Algorithm

– Minimum Spanning Trees– Networks– Eulerian Graphs– Hamiltonian Graphs– The Travelling Salesman

Page 3: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

This week

• Sorting Algorithms– The efficiency of many algorithms depends on

the way data is stored.– Most data structures can be improved by

ordering the data structure in some way.– This week we will investigate different

approaches to sorting data.

Page 4: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Sorting Criteria

• The first step is to decide what to sort by;– numerical– alphabetical– ASCII codes

Page 5: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Algorithm Measures

• Best Case– Often when data already sorted

• Worst Case– Data completely disorganised, or in reverse order.

• Average Case– Random order

• Some sorting algorithms are the same for all three cases, others can be tailored to suit certain cases.

Page 6: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Sorting Algorithms

• Sorting Algorithms are generally comprised of ‘comparisons’ and ‘movements’.

• If the data being sorted is complicated (strings for example), then the comparison part can be more expensive.

• If the data items are large, then the movement part can be more expensive.

Page 7: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Elementary Sorting Algorithms

• First lets investigate 3 elementary sorting algorithms;– Insertion Sort– Selection Sort– Bubble Sort

Page 8: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Insertion Sort

• Insertion sort starts with the first two elements, and compares them;– a[0] < a[1] ?

• If necessary it will swap them.

• Next the 3rd element is compared and positioned, and then the 4th etc.

Page 9: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Insertion Sort

template<class T>void insertionsort(T data[], int n) {

for (int i = 1, j; i < n; i++) {T tmp = data[i];for (j=i; j<0 && tmp < data[j-1]; j--)

data[j] = data[j-1];data[j] = tmp;

}}

Page 10: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Insertion Sort

• The insertion method only sorts the part of the array that needs to be sorted – if the array is already sorted, less processing is required.– Best Case when data is already in order, 1

comparison per position (n-1), or O(n).– Worst Case is when data is in reverse order and ever,

in which case comparisons are required for both the outer and the inner loop, O(n2)

– How about Average Case? The result should be somewhere between O(n) and O(n2), but which is it closer to?

Page 11: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Insertion Sort – Average Case

• The number of iterations of the outer loop depends on how far the number is away from where it should be;– it could loop 0 times if it is where it should be, or i

times if it needs to be inserted at the start.– Therefore on average it should be;

0+1+2+…+i / i = (i+1)/2– This is the average per case, which needs to be

multiplied by the number of numbers (n), leaving O(n2), sadly closer to the worst case than best.

Page 12: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Selection Sort

• Selection Sort takes each value and moves it to its final place.– i.e. it first searches for the smallest value and

moves it to the start, then the next smallest and moves that to position 2.

Page 13: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Selection Sort

template<class T>void selectionsort(T data[], int n) {

for (int i = 0,j,least; i<n-1; i++) {for (j = i+1, least = i; j < n; j++)

if (data[j] < data[least])least = j;

swap(data[least], data[i]);}

}• Once again this algorithm relies on a double loop, so

O(n2)

Page 14: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

BubbleSort

• Imagine a glass of beer…

• The bubbles slowly rise to the surface.

• Now imagine those bubbles are the data to be sorted, with the lowest values rising to the surface.

• That’s how Bubblesort works!

Page 15: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

BubbleSort

• Starting with the other end of the dataset, a[n-1], each value is compared with its predecessor, a[n-2], and if it is less it is swapped.

• Next a[n-2] is compared with a[n-3], and if necessary swapped, and so on.

• Therefore at the end of the first loop through the array, the smallest value is in position a[0], but many of the other values have been bubbled towards their final location.

• The next loop through will move values closer still.

Page 16: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

BubbleSort

template<class T>

void bubblesort(T data[], int n) {

for (int i=0; i<n-1; i++)

for (int j=n-1; j>I; --j)

if (data[j] < data[j-1])

swap(data[j], data[j-1]);

}

Page 17: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

BubbleSort

• With Bubble Sort there are the same number of comparisons in every case (best, worst and average)

• Sadly that number of comparisons is;

(n(n-1))/2, or O(n2)

Page 18: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Cocktail Sort

• A variation of the BubbleSort is called Cocktail Sort.

• Here the bubbles move in both directions alternatingly. As though a cocktail was being shaken. First lower values are bubbled towards the start, then higher values are bubbled towards the end.

Page 19: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Elementary Sorting Algorithms

• Sorry… but O(n2) just isn’t good enough anymore!

Page 20: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Efficient Sorting Algorithms

• Shell Sort

• Heap Sort

• Quick Sort

• Merge Sort

• Radix Sort

Page 21: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

ShellSort

• One problem with the basic sorting algorithms is that their runtime grows faster than the size of the array.

• Therefore one way to solve this problem is to perform the sort on smaller / limited subarray parts of the array.

• ShellSort (named after Donald Shell), breaks the array into subarrays, sorts them, and then sorts the whole array.

Page 22: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

ShellSort

• Shellsort is actually slightly more complex than that, because decisions need to be made about the size of each subarray.

• Normally the Sort is performed on a decreasing number of sub arrays, with the final sort being performed on the whole array (the sub arrays growing in size each Sort).

• Note that the actual Sort part can use any of the Elementary Sorting Algorithms already discussed.

Page 23: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Shell Sort Sample

• Original Data Set10, 15, 2, 9, 12, 4, 19, 5, 8, 0, 11

• 5 subarray Sort10, 4, 11 4,10,11

15, 19 15, 19

2, 5 2, 5

9, 8 8, 9

12, 0 0, 12

• After 5 subarray Sort

4, 15, 2, 8, 0, 10, 19, 5, 9, 12, 11

Page 24: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Shell Sort Sample cont.

4, 15, 2, 8, 0, 10, 19, 5, 9, 12, 11

• 3 subarray Sort4, 8, 19, 12 4, 8, 12, 1915, 0, 5, 11 0, 5, 11, 152, 10, 9 2, 9, 10

• After 3 subarray Sort4, 0, 2, 8, 5, 9, 12, 11, 10, 19, 15

• Final Sort 0, 2, 4, 5, 8, 9, 10, 11, 12, 15, 19

Page 25: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Shell Sort

• In the Sample, I arbitrarily chose sub array sizes of 1, 3 and 5.

• In reality that increment isn’t ideal, but then it isn’t clear what increment is.

• From empirical studies, a suggested increment follows this pattern;– h1 = 1– hi+1 = 3hi+1– until ht+2 ≥n

• What sequence will this give?

Page 26: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Shell Sort

• Shell Sort is shown to be better than O(n2), but how much depends on the increments.

• With Shell Sort different Elementary Sorting Algorithms can affect the efficiency of the ShellSort, but its not agreed on how.

• A favourite is surely the “Cocktail Shaker Sort”, which uses iterations of the bubble sort, finishing off with a insertion sort – as the cherry is inserted into the cocktail!

Page 27: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Heap Sort

• To investigate how Heap Sort works, we need to investigate a Heap further.

• A heap is like a binary tree, with 2 different properties;– First the value of each node is not less than

the values in its children (i.e. the highest value is the root).

– Second the tree is perfectly balanced.

Page 28: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Heap Sort

• Heap Sort makes use of a heap, and is a variation of Selection Sort.

• The weakness of selection Sort was the number of comparisons required – because of the improved searchability of a tree, heap sort reduces the comparisons made during selection sort.

• First a heap is created, then the root is moved to the end of the dataset, before a new heap is created.

• Therefore the complexity of the algorithm mirrors the complexity of creating a heap.

Page 29: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Heap Sample

15

12 6

211 10 3

81

Remove root node, and make new heap from 2 heaps

Page 30: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Heap Sort

• Rather than moving the root to a new array, heapsort uses the same memory space and swaps the root value with the last leaf node (8).

• 8 is then be moved down the tree, swapping with the larger child until it reaches its new base (i.e. swap with 12 and then 11).

• The new root can then be swapped with the next lowest node (1).

• And the process repeated, which in the worst case is O(n lg n).

Page 31: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

QuickSort

• Quicksort uses a recursive approach, which is similar to the approach we used when guessing a number between 1 and 10.

• Quicksort shares with Shellsort the ideal of breaking the problem down into subproblems that are easier to solve.

• Quicksort divides the values into 2 subarrays – those lower than value ‘x’ and those higher than value ‘x’.

• The 2 subarrays can then be sorted – guess what… using QuickSort.

Page 32: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

QuickSort

• The problem remains which value to chose as the pivot or bound value?

• Obviously if we choose a value too near to either end of the dataset we will have an unbalanced split.

• We could arbitrarily choose the first value, but some arrays are already partly sorted, therefore we could arbitrarily choose the middle value.

Page 33: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Quick Sort

{10, 15, 2, 9, 12, 4, 19, 5, 8, 0, 11}

{2, 0} 4 {10, 15, 9, 12, 19, 5, 8, 11}

{0, 2} 4 {10, 9, 5, 8, 11} 12 {15, 19}

0, 2, 4, {}5{10, 9, 8, 11} 12 {15,19}

0, 2, 4, 5 {8}9{10, 11} 12, 15, 19

1, 2, 4, 5, 8, 9, 10, 11, 12, 15, 19

Page 34: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

QuickSort Pseudocode

Quicksort(array[])if length(array)>1

choose bound;while there are elements in array

include element in either subarray1 or subarray2;

quicksort(subarray1);quicksort(subarray2);

Page 35: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

QuickSort

• Best Case – when the array divides into even size parts;n + 2n/2 + 4n/4 + 8n/8 + … …+nn/n = O(n lg n)

• Worst Case – when one array is always size 1;– O(n2)

• Average case is in between. Ideally if we can avoid the worst case, but to do that we need to pick a good pivot value;– A better approach is to choose the median of the first,

last and middle values

Page 36: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

MergeSort

• The problem with QuickSort is still picking a good pivot value.

• MergeSort combats this by concentrating on merging two already sorted subarrays.

• The merge process can be quite straight forward assuming the two subarrays are already sorted.

• We can sort the subarrays recursively using MergeSort!

Page 37: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Merge Sort

Mergesort(data)

If data has at least 2 elements

Mergesort(left half);

Mergesort(right half);

merge(both halves);

Page 38: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

merge

merge(array1, array2, array3)

int i1 = 0, i2 = 0, i3 = 0;

while both array2 and array 3 have elements

if array2[i2] < array3[i3]

array1[i1++] = array2[i2++];

else array1[i1++] = array3[i3++];

add the remains of either array2 or array3 into array1;

Page 39: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

MergeSort

{10, 15, 2, 9, 12, 4, 19, 5, 8, 0, 11}

{10, 15, 2, 9, 12, 4} {19, 5, 8, 0, 11}

{10, 15, 2}{9, 12, 4} {19, 5, 8}{0, 11}

{10,15}{2}{9,12}{4} {19,5}{8}{0,11}

{2, 10, 15}{4, 9, 12} {5, 8, 19}{0, 11}

{2, 4, 9, 10, 12, 15} {0, 5, 8, 11, 19}

{0, 2, 4, 5, 8, 9, 10, 11, 12, 15, 19}

Page 40: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

MergeSort

• The MergeSort algorithm is O(n lg n), but has a major drawback due to the space required to store multiple arrays (in the run time stack).

• An iterative version while visually more complex is a little more efficient space wise.

• Also using a linked list rather than an array can counter the space issue.

Page 41: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Radix Sorting

• Radix Sorting mirrors a real life approach to sorting.

• Given a selection of books, we might first make a pile of all the ‘A’ author books, all the ‘B’ author books etc. Then we would sort each pile.

• Or, if given a series of 10 digit numbers, we might first sort according to the first digit, and then according to the second digit etc.

Page 42: ICS 220 – Data Structures and Algorithms Dr. Ken Cosh Lecture 9

Comparing approaches

• Take a look at the table on page 511 to see how the different sorting algorithms compare.