data structures and algorithmsย ยท radix sort input: an array a storing integers, where (i) each...

23
Data Structures and Algorithms (CS210A) Semester I โ€“ 2014-15 Lecture 39 โ€ข Integer sorting : โ€ข Search data structure for integers : 1 Radix Sort Hashing

Upload: others

Post on 09-Jun-2020

19 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Data Structures and Algorithms (CS210A)

Semester I โ€“ 2014-15

Lecture 39 โ€ข Integer sorting :

โ€ข Search data structure for integers :

1

Radix Sort

Hashing

Page 2: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Types of sorting algorithms

In Place Sorting algorithm:

A sorting algorithm which uses only O(1) extra space to sort.

Example: Heap sort, Quick sort.

Stable Sorting algorithm:

A sorting algorithm which preserves the order of equal keys while sorting.

Example: Merge sort.

2

A

0 1 2 3 4 5 6 7

2 5 3 0 6.1 3 7.9 4

A

0 1 2 3 4 5 6 7

0 2 3 3 4 5 6.1 7.9

Page 3: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Integer Sorting algorithms

Continued from last class

Page 4: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Counting sort: algorithm for sorting integers

Input: An array A storing ๐’ integers in the range [0โ€ฆ๐’Œ โˆ’ ๐Ÿ].

Output: Sorted array A.

Running time: O(๐’ + ๐’Œ) in word RAM model of computation.

Extra space: O(๐’ + ๐’Œ)

Page 5: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Counting sort: a visual description

A

0 1 2 3 4 5 6 7

Count

0 1 2 3 4 5

2 5 3 0 2 3 0 3

2

2 2 4 7 7 8

0 2 3 0 1

Place

0 1 2 3 4 5

B

0 1 2 3 4 5 6 7

3 0

We could have used Count array only to output the elements of A in sorted

order. Why did we compute Place and B ?

Why did we scan elements of A in reverse

order (from index ๐’ โˆ’ ๐Ÿ to ๐ŸŽ) while placing them in the final

sorted array B ?

6 1 Answer: The input might be an array of records and the aim to sort these records according to some integer field.

Answer: To ensure that Counting sort is stable. The reason why stability is required will become clear soon

Page 6: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Counting sort: algorithm for sorting integers

CountSort(A[๐ŸŽ...๐’ โˆ’ ๐Ÿ], ๐’Œ)

For ๐’‹=0 to ๐’Œ โˆ’ ๐Ÿ do Count[๐’‹] 0;

For ๐’Š=0 to ๐’ โˆ’ ๐Ÿ do Count[A[๐’Š]] Count[A[๐’Š]] +1;

For ๐’‹=0 to ๐’Œ โˆ’ ๐Ÿ do Place[๐’‹] Count[๐’‹];

For ๐’‹=1 to ๐’Œ โˆ’ ๐Ÿ do Place[๐’‹] Place[๐’‹ โˆ’ ๐Ÿ] + Count[๐’‹];

For ๐’Š=๐’ โˆ’ ๐Ÿ to ๐ŸŽ do

{ B[ ?? ] A[๐’Š];

Place[A[๐’Š]] Place[A[๐’Š]]-1;

}

return B;

Place[A[๐’Š]]-1

Page 7: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Counting sort: algorithm for sorting integers

Key points of Counting sort:

โ€ข It performs arithmetic operations involving O(log ๐’ + log ๐’Œ) bits

(O(1) time in word RAM).

โ€ข It is a stable sorting algorithm.

Theorem: An array storing ๐’ integers in the range [๐ŸŽ..๐’Œ โˆ’ ๐Ÿ]can be sorted in O(๐’+๐’Œ) time and using total O(๐’+๐’Œ) space in word RAM model.

For ๐’Œ โ‰ค ๐’, we get an optimal algorithm for sorting.

For ๐’Œ = ๐’๐’•, time and space complexity is O(๐’๐’•).

(too bad for ๐’• > ๐Ÿ . )

Question:

How to sort ๐’ integers in the range [๐ŸŽ..๐’๐’•] in O(๐’•๐’) time and using O(๐’) space?

Page 8: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Radix Sort

Page 9: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Digits of an integer

507266

No. of digits = ?

value of digit = ?

1011000101011111

No. of digits = 16

value of digit โˆˆ {0,1}

It is up to us how we define digit ?

โˆˆ {0, โ€ฆ , 9}

6

4

โˆˆ {0, โ€ฆ , 15}

Page 10: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Radix Sort

Input: An array A storing ๐’ integers, where

(i) each integer has exactly ๐’… digits.

(ii) each digit has value < ๐’Œ

(iii) ๐’Œ < ๐’.

Output: Sorted array A.

Running time:

O(๐’…๐’) in word RAM model of computation.

Extra space:

O(๐’ + ๐’Œ)

Important points:

โ€ข makes use of a count sort.

โ€ข Heavily relies on the fact that count sort is a stable sort algorithm.

Page 11: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

2 0 1 2 1 3 8 5 4 9 6 1 5 8 1 0 2 3 7 3 6 2 3 9 9 6 2 4 8 2 9 9 3 4 6 5 7 0 9 8 5 5 0 1 9 2 5 8

5 8 1 0 4 9 6 1 5 5 0 1 2 0 1 2 2 3 7 3 9 6 2 4 1 3 8 5 3 4 6 5 7 0 9 8 9 2 5 8 6 2 3 9 8 2 9 9

Demonstration of Radix Sort through example

A

๐’… =4 ๐’ =12 ๐’Œ =10

Page 12: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

2 0 1 2 1 3 8 5 4 9 6 1 5 8 1 0 2 3 7 3 6 2 3 9 9 6 2 4 8 2 9 9 3 4 6 5 7 0 9 8 5 5 0 1 9 2 5 8

5 8 1 0 4 9 6 1 5 5 0 1 2 0 1 2 2 3 7 3 9 6 2 4 1 3 8 5 3 4 6 5 7 0 9 8 9 2 5 8 6 2 3 9 8 2 9 9

5 5 0 1 5 8 1 0 2 0 1 2 9 6 2 4 6 2 3 9 9 2 5 8 4 9 6 1 3 4 6 5 2 3 7 3 1 3 8 5 7 0 9 8 8 2 9 9

Demonstration of Radix Sort through example

A

Page 13: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

2 0 1 2 1 3 8 5 4 9 6 1 5 8 1 0 2 3 7 3 6 2 3 9 9 6 2 4 8 2 9 9 3 4 6 5 7 0 9 8 5 5 0 1 9 2 5 8

5 8 1 0 4 9 6 1 5 5 0 1 2 0 1 2 2 3 7 3 9 6 2 4 1 3 8 5 3 4 6 5 7 0 9 8 9 2 5 8 6 2 3 9 8 2 9 9

5 5 0 1 5 8 1 0 2 0 1 2 9 6 2 4 6 2 3 9 9 2 5 8 4 9 6 1 3 4 6 5 2 3 7 3 1 3 8 5 7 0 9 8 8 2 9 9

2 0 1 2 7 0 9 8 6 2 3 9 9 2 5 8 8 2 9 9 2 3 7 3 1 3 8 5 3 4 6 5 5 5 0 1 9 6 2 4 5 8 1 0 4 9 6 1

Demonstration of Radix Sort through example

A

Page 14: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

2 0 1 2 1 3 8 5 4 9 6 1 5 8 1 0 2 3 7 3 6 2 3 9 9 6 2 4 8 2 9 9 3 4 6 5 7 0 9 8 5 5 0 1 9 2 5 8

5 8 1 0 4 9 6 1 5 5 0 1 2 0 1 2 2 3 7 3 9 6 2 4 1 3 8 5 3 4 6 5 7 0 9 8 9 2 5 8 6 2 3 9 8 2 9 9

5 5 0 1 5 8 1 0 2 0 1 2 9 6 2 4 6 2 3 9 9 2 5 8 4 9 6 1 3 4 6 5 2 3 7 3 1 3 8 5 7 0 9 8 8 2 9 9

2 0 1 2 7 0 9 8 6 2 3 9 9 2 5 8 8 2 9 9 2 3 7 3 1 3 8 5 3 4 6 5 5 5 0 1 9 6 2 4 5 8 1 0 4 9 6 1

1 3 8 5 2 0 1 2 2 3 7 3 3 4 6 5 4 9 6 1 5 5 0 1 5 8 1 0 6 2 3 9 7 0 9 8 8 2 9 9 9 2 5 8 9 6 2 4

Demonstration of Radix Sort through example

A

Can you see where we are exploiting the fact that Countsort is a stable sorting algorithm ?

Page 15: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Radix Sort

๐’‹

RadixSort(A[๐ŸŽ...๐’ โˆ’ ๐Ÿ], ๐’…, ๐’Œ)

{ For ๐’‹=1 to ๐’… do

Execute CountSort(A,๐’Œ) โ€ฆ

return A;

}

Correctness:

Inductive assertion:

At the end of ๐’‹th iteration, โ€ฆ

๐’‹

๐’… โ€ฆ ๐Ÿ ๐Ÿ

A number stored in A

During the induction step, you will have to use the fact that Countsort is a stable sorting algorithm.

array A is sorted according to the last ๐’‹ digits.

with ๐’‹th digit as the key;

Page 16: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Radix Sort RadixSort(A[๐ŸŽ...๐’ โˆ’ ๐Ÿ], ๐’…, ๐’Œ)

{ For ๐’‹=1 to ๐’… do

Execute CountSort(A,๐’Œ) with ๐’‹th digit as the key;

return A;

}

Time complexity:

โ€ข A single execution of CountSort(A,๐’Œ) runs in O(๐’ + ๐’Œ) time and O(๐’ + ๐’Œ) space.

โ€ข Note ๐’Œ < ๐’,

a single execution of CountSort(A,๐’Œ) runs in O(๐’) time.

Time complexity of radix sort = O(๐’…๐’).

โ€ข Extra space used = ?

Question: How to use Radix sort to sort ๐’ integers in range [๐ŸŽ..๐’๐’•] in O(๐’•๐’) time and O(๐’) space ?

Answer: RadixSort(A[๐ŸŽ...๐’ โˆ’ ๐Ÿ], ๐’•, ๐’)

O(๐’)

๐’… ๐’Œ Time complexity

๐’• log ๐’ ๐Ÿ O(๐’•๐’ log ๐’)

O(๐’•๐’) ๐’ ๐’•

What digit to use ? ๐Ÿ bit

log ๐’ bits

Page 17: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Power of the word RAM model

โ€ข Very fast algorithms for sorting integers:

Example: ๐’ integers in range [๐ŸŽ..๐’๐Ÿ๐ŸŽ] in O(๐’) time and O(๐’) space ?

โ€ข Lesson:

Do not always go after Merge sort and Quick sort when input is integers.

โ€ข Interesting programming exercise (for winter vacation):

Compare Quick sort with Radix sort for sorting long integers.

Page 18: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Data structures for searching

in O(1) time

Page 19: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Motivating Example

Input: a given set ๐‘บ of 1009 positive integers

Aim: Data structure for searching

Example

{

123, 579236, 1072664, 770832456778, 61784523, 100004503210023, 19 ,

762354723763099, 579, 72664, 977083245677001238, 84, 100004503210023,

โ€ฆ

}

Data structure : ?

Searching : ?

Array storing ๐‘บ in sorted order

Binary search

O(log |๐‘บ|) time Can we perform

search in O(๐Ÿ) time ?

Page 20: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Problem Description

๐‘ผ : {0,1, โ€ฆ , ๐’Ž โˆ’ ๐Ÿ} called universe

๐‘บ โŠ† ๐‘ผ,

๐’ = ๐‘บ ,

Aim: A data structure for a given set ๐‘บ that can facilitate searching in O(1) time.

A search query: Does ๐’‹ โˆˆ ๐‘บ ?

Note: ๐’‹ can be any element from ๐‘ผ.

๐’ โ‰ช ๐’Ž

Page 21: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

A trivial data structure for O(1) search time

Build a 0-1 array A of size ๐’Ž such that

A[๐’Š] = 1 if ๐’Š โˆˆ ๐‘บ.

A[๐’Š] = 0 if ๐’Š โˆ‰ ๐‘บ.

Time complexity for searching an element in set ๐‘บ : O(1).

This is a totally Impractical data structure because ๐’ โ‰ช ๐’Ž !

Example: ๐’ = few thousands, ๐’Ž = few trillions.

Question:

Can we have a data structure of O(๐’) size that can answer a search query in O(1) time ?

Answer: Hashing

0 0 1 0 0 โ€ฆ 0 1 0 โ€ฆ 0 0 1

0 1 2 3 4 โ€ฆ โ€ฆ ๐’Ž-1

A

Page 22: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Hash function, hash value, hash table

Hash function:

๐’‰ is a mapping from ๐‘ผ to {0,1, โ€ฆ , ๐’ โˆ’ ๐Ÿ}

with the following characteristics.

โ€ข Space required for ๐’‰ : a few words.

โ€ข ๐’‰(๐’Š) computable in O(1) time in word RAM.

Example: ๐’‰(๐’Š) = ๐’Š mod ๐’

Hash value:

๐’‰(๐’Š) is called hash value of ๐’Š for a given hash function ๐’‰, and ๐’Š โˆˆ ๐‘ผ.

Hash Table:

An array ๐‘ป[0 โ€ฆ ๐’ โˆ’ ๐Ÿ] of โ€ฆ

0 1 2 . . . . . . . . . . .

๐’Ž โˆ’ ๐Ÿ

0 1 . .

๐’ โˆ’ ๐Ÿ

๐’‰

Page 23: Data Structures and Algorithmsย ยท Radix Sort Input: An array A storing integers, where (i) each integer has exactly ๐’… digits. (ii) each digit has value < (iii) < . Output: Sorted

Hash function, hash value, hash table

Hash function:

๐’‰ is a mapping from ๐‘ผ to {0,1, โ€ฆ , ๐’ โˆ’ ๐Ÿ}

with the following characteristics.

โ€ข Space required for ๐’‰ : a few words.

โ€ข ๐’‰(๐’Š) computable in O(1) time in word RAM.

Example: ๐’‰(๐’Š) = ๐’Š mod ๐’

Hash value:

๐’‰(๐’Š) is called hash value of ๐’Š for a given hash function ๐’‰, and ๐’Š โˆˆ ๐‘ผ.

Hash Table:

An array ๐‘ป[0 โ€ฆ ๐’ โˆ’ ๐Ÿ] of โ€ฆ

0 1 2 . . . . . . . . . . .

๐’Ž โˆ’ ๐Ÿ

0 1 . . . . .

๐’ โˆ’ ๐Ÿ

๐’‰

๐‘ป

What should ๐‘ป store? Ponder over itโ€ฆ

We shall discuss it in the next class.