1 arrays 2: sorting and searching. 2 0. admin. §1) no class thursday. §2) will cover strings next...

29
1 Arrays 2: Sorting and Searching

Upload: teresa-price

Post on 12-Jan-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

1

Arrays 2: Sorting and Searching

Page 2: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

2

0. Admin.

1) No class Thursday.2) Will cover Strings next Tuesday.3) Take in report.4) Hand out program assignment 6, due 11/20,

based on material covered today.

Page 3: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

3

1. Review of array basics.

What is an array?A contiguous series of homogeneous

elements.What does contiguous mean? Pros and cons?What does homogeneous mean? Pros and cons?

Page 4: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

4

The bounds of an array.

C# defaults to 0-based arrays. If we code float [] Amounts = new float [100];

we get 100 consecutive floats from Amounts[0] to Amounts[99]. It is important to remember this so we avoid a “bounds error”.

Suppose we code Amounts[I] = 83000;What if I = -2 or 100?

Page 5: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

5

Review of the list concept.

The list is the occupied portion of the array.The array is the physical container. It may be

empty, full, or partially full.E.g. Amounts, which can contain 100 elements,

may only be occupied from Amounts[0] to Amounts[56].

The list consists of only the valid data values. This does not include “garbage” that happens to be in memory beyond the list.

Page 6: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

6

Why does the list / array distinction matter?

We should only process the valid data (list).Otherwise we may e.g. include garbage

values in the Total of all entries, update both valid and garbage data, sort garbage into valid data and search in the garbage!

This is computer science, not the FBI!

Page 7: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

7

List processing.

While it is proper to talk of loading an array, thereafter it is helpful to speak of processing the list:

We should ensure that our algorithm only processes the valid data values, e.g. from Amounts[0] to Amounts[56].

Page 8: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

8

2. Sorting lists.

To sort information means to arrange it in some order.

It may be a single item, arranged in ascending or descending order e.g. opening weekend profits for movies from best to worst.

Usually it is a collection of data (a record) sorted in order of a key field.

Page 9: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

9

Why sort lists?

1) Shows highest / lowest values at top / bottom.2) May discover important groupings e.g. a lot of

people called Patel, Smith or Slarteebartfast.3) Easier to search.Why is a dictionary easier to search than a random

listing of words and their definitions?

Page 10: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

10

The Bubble Sort.

Idea:Repeatedly pass through a list, comparing

adjacent elements.If the elements are out of order, swap.Large values “bubble”, one swap at a time,

to the end of the list.

Page 11: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

11

Example.

Pass 1:E B B BB E E EF C C CC F D DD D F AA A A F

Pass 2:B B BC C CE D DD E AA A EF F F

Page 12: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

12

Example.

Pass 3: Pass 4:

B B

C A

A C

D D

E E

F F

Pass 5: A B C D E F

Page 13: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

13

Why is it called the “bubble” sort?

After the first pass, it is guaranteed that the largest element (F) is sorted.

After the second pass, it is guaranteed that the second largest element (E) is sorted.

So, if we aren’t lucky, the maximum number of passes for N elements is….

Page 14: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

14

Why is it called the “bubble” sort?

N-1.Why?If all the elements larger than the smallest

are correctly placed, the smallest element must already be correctly placed.

What about the maximum number of comparisons within one pass, comparing each adjacent pair?

Page 15: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

15

Why is it called the “bubble” sort?

Also N-1. Why?So in worse case, if there are N elements,

takes N-1 * N-1 or O (N2) comparisons and swaps to sort N elements.

However, the algorithm can be optimized.(1) If the list is sorted in fewer than N-1

passes, how could we know?

Page 16: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

16

Optimization.

Use a boolean Swap flag: set to false before each pass; only set to true if a swap occurs.

So, if it stays false, there were no swaps, so the list must be sorted already.

(2) Do we always need to compare every adjacent pair?

No, because some parts are already sorted.

Page 17: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

17

Optimization.

Can continue to do comparisons up until the last element that was swapped, e.g.

B B C C A A E D } Last D E } Change F F

Page 18: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

18

See BUBSORT.CS

This code shows an optimized Bubble Sort (which is a bit like saying a Turbo tortoise).

Note how arrays are passed as parameters.Note how the Load function keeps track of

the size of the list.Look at the Bubble Sort logic.

Page 19: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

19

Logic of Bubble Sort.

Note the use of a nested loop.The outer loop controls the passes, running

until the first pass with no swaps.The inner loop controls the comparisons

within each pass, running until the last change made on the previous pass.

Note the Swap logic. Why are Temporary variables needed?

Page 20: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

20

Swapping.

Suppose we want to swap E and B.We can’t say move E to B, then B to E,

because the first move destroys B.So?1) Move E to Temp.2) Move B to E.3) Move Temp to B.

Page 21: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

21

Limitation of the Bubble Sort.

The Bubble Sort is slow.But why? Can only swap adjacent elements.Consider the worst case scenario: Index Value [0] 999 . . [999] 0

Page 22: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

22

What is the problem?

It will take 999 comparisons and swaps to place the Value 999 at the correct location [999]. Shuffling is very slow.

What would help solve this problem?

Page 23: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

23

The Shell Sort.

Donald Shell suggested that we compare elements separated by a larger gap.

Start with a gap half the size of the list. Keep passing through until there is a pass with no swaps, then halve the gap.

Why does this help?See diagram and Shell sort code.

Page 24: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

24

Efficiency.

Studies find no significant difference between the Shell Sort and Bubble Sort for small lists e.g. <= 10 elements.

But for 100 / 1000 / 10000, increasing advantage to use Shell Sort.

The Shell Sort is estimated to be anO (N1.25) algorithm.

Page 25: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

25

Searching Lists.

If a list is unsorted, we are forced to do a serial or linear search:

I = 0;Found = false;while (!Found && I < ListSize) { if (SearchElement != List[I]) I++; else Found = true;} // end while

Page 26: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

26

But this can be tedious…

A serial search is O (N), like loading an array.

Only Forrest Gump would search a dictionary this way!

Why is this silly? How would we search a dictionary?

Page 27: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

27

The Binary Search (for sorted lists).

Use 3 indexes Start, Stop and Middle.Compare SearchElement to Element[Middle] if (SearchElement < List[Middle]) Stop = Middle -1;else if (SearchElement > List[Middle] ) Start = Middle + 1else Found = true;

Page 28: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

28

More technically: Start = 0; Stop = ListLength-1; Found = false; while ((Start <= Stop) && (!Found)) { Middle = (Start + Stop) / 2; if (SearchElement < List[Middle]) Stop = Middle -1; else if (SearchElement > List[Middle]) Start = Middle + 1; else { Found = true; Location = Middle } } // end while

Page 29: 1 Arrays 2: Sorting and Searching. 2 0. Admin. §1) No class Thursday. §2) Will cover Strings next Tuesday. §3) Take in report. §4) Hand out program assignment

29

Example:

A B C D E F G [0] [1] [2] [3] [4] [5] [6] Start Middle Stop

E.g. Find ‘B’

E.g. Find ‘E’.