Jeff Edmonds, York University, COSC 2011, Lecture 6: Balanced Trees. Dictionary/Map ADT, Binary Search Trees, Insertions and Deletions, AVL Trees, Rebalancing AVL Trees, Union-Find Partition, Heaps & Priority Queues, Communication & Huffman Codes (Splay Trees)

Upload: marilynn-carson

Post on 31-Dec-2015


Jeff Edmonds
York University, COSC 2011
Lecture 6

Balanced Trees

Dictionary/Map ADT
Binary Search Trees
Insertions and Deletions
AVL Trees
Rebalancing AVL Trees
Union-Find Partition
Heaps & Priority Queues
Communication & Huffman Codes
(Splay Trees)

Dictionary/Map ADT

Problem: Store value/data associated with keys.

Input: (key, value) pairs: (k1,v1), (k2,v2), (k3,v3), (k4,v4).

Examples:
• key = word, value = definition
• key = social insurance number, value = person's data

Dictionary/Map ADT

Problem: Store value/data associated with keys.

Input: (key, value) pairs: (k1,v1), (k2,v2), (k3,v3), (k4,v4); new entry (k5,v5).

Implementations:

                    Insert      Search
Unordered Array     O(1)        O(n)

[Figure: array with cells 0 to 7 holding the entries; the new entry (k5,v5) goes in the next free cell]

Dictionary/Map ADT

Problem: Store value/data associated with keys.

Input: (key, value) pairs, kept in key order: (2,v3), (4,v4), (7,v1), (9,v2); new entry (6,v5).

Implementations:

                    Insert      Search
Unordered Array     O(1)        O(n)
Ordered Array       O(n)        O(log n)

[Figure: array with cells 0 to 7 holding the entries in key order; inserting (6,v5) shifts the larger entries]
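The ordered-array row of the table can be sketched as follows. This is a minimal illustration, not the course's implementation: the class name `SortedArrayMap` is made up here, and Python's standard `bisect` module stands in for a hand-written binary search.

```python
import bisect

class SortedArrayMap:
    """Dictionary stored as two parallel lists kept sorted by key (illustrative name)."""

    def __init__(self):
        self.keys, self.values = [], []

    def insert(self, key, value):
        i = bisect.bisect_left(self.keys, key)    # O(log n) to find the spot
        if i < len(self.keys) and self.keys[i] == key:
            self.values[i] = value                # key already present: overwrite
        else:
            self.keys.insert(i, key)              # O(n): later entries shift right
            self.values.insert(i, value)

    def search(self, key):
        i = bisect.bisect_left(self.keys, key)    # binary search, O(log n)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None                               # "not there"

m = SortedArrayMap()
for k, v in [(7, "v1"), (9, "v2"), (2, "v3"), (4, "v4"), (6, "v5")]:
    m.insert(k, v)
print(m.keys)         # [2, 4, 6, 7, 9]
print(m.search(6))    # v5
```

Finding the spot is fast; it is the shifting inside `list.insert` that makes insertion O(n).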

Dictionary/Map ADT

Problem: Store value/data associated with keys.

Input: (key, value) pairs, kept in key order: (2,v3), (4,v4), (7,v1), (9,v2); new entry (6,v5).

Implementations:

                      Insert      Search
Unordered Array       O(1)        O(n)
Ordered Array         O(n)        O(log n)
Ordered Linked List   O(n)        O(n)

Inserting into the linked list is O(1) if you have the spot, but O(n) to find the spot.

[Figure: doubly linked list with header and trailer sentinels; the nodes/positions hold the entries]

Dictionary/Map ADT

Problem: Store value/data associated with keys.

Input: (key, value) pairs: (2,v3), (4,v4), (7,v1), (9,v2).

Implementations:

                      Insert      Search
Unordered Array       O(1)        O(n)
Ordered Array         O(n)        O(log n)
Binary Search Tree    O(log n)    O(log n)

[Figure: binary search tree with root 38; second level 25, 51; third level 17, 31, 42, 63; leaves 4, 21, 28, 35, 40, 49, 55, 71]

Dictionary/Map ADT

Problem: Store value/data associated with keys.

Input: (key, value) pairs: (2,v3), (4,v4), (7,v1), (9,v2).

Implementations:

                      Insert              Search      Max
Unordered Array       O(1)                O(n)
Ordered Array         O(n)                O(log n)
Binary Search Tree    O(log n)            O(log n)
Heaps                 O(log n) (faster)   O(n)        O(1)

Heaps are good for Priority Queues.

Dictionary/Map ADT

Problem: Store value/data associated with keys.

Input: (key, value) pairs: (2,v3), (4,v4), (7,v1), (9,v2).

Implementations:

                      Insert              Search       Max     Next
Unordered Array       O(1)                O(n)
Ordered Array         O(n)                O(log n)
Binary Search Tree    O(log n)            O(log n)
Heaps                 O(log n) (faster)                O(1)
Hash Tables           O(1) (avg)          O(1) (avg)           O(n)

Hash Tables are very fast, but keys have no order.

Summary table

Data structures compared: Unsorted List, Sorted List (static), Balanced Trees, Splay Trees, Heap (priority queue), Hash Tables (dictionary).

Operations compared:
• Search
• Insert
• Delete
• Find Max
• Find Next in Order

Balanced Trees: O(log(n)) for every operation.
Splay Trees: amortized bound, worst case O(n), better in practice.

[Table: per-operation cost of each structure, with entries O(1), O(log(n)) and O(n); the cell layout was lost in extraction]

I learned AVL trees from slides from Andy Mirzaian and James Elder and then reworked them.

From binary search to Binary Search Trees

Binary Search Tree

All nodes in left subtree ≤ Any node ≤ All nodes in right subtree

[Figure: binary search tree with root 38; second level 25, 51; third level 17, 31, 42, 63; leaves 4, 21, 28, 35, 40, 49, 55, 71]

Iterative Algorithm

Search for key 17.

[Figure: binary search tree with root 38; second level 25, 51; third level 17, 31, 42, 63; leaves 4, 21, 28, 35, 40, 49, 55, 71]

Algorithm TreeSearch(k)
    v = T.root()
    loop
        if T.isExternal(v)
            return "not there"
        if k < key(v)
            v = T.left(v)
        else if k = key(v)
            return v
        else { k > key(v) }
            v = T.right(v)
    end loop

Move down the tree.
Loop Invariant: If the key is contained in the original tree, then the key is contained in the subtree rooted at the current node.
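A minimal Python sketch of the iterative search, under the assumption that an empty (external) position is represented by `None` rather than by a sentinel node; the `Node` class and the name `tree_search` are illustrative only.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def tree_search(root, k):
    """Return the node holding k, or None if k is not in the tree."""
    v = root
    # Invariant: if k is in the original tree, it is in the subtree rooted at v.
    while v is not None:          # None plays the role of an external node
        if k < v.key:
            v = v.left
        elif k == v.key:
            return v
        else:                     # k > v.key
            v = v.right
    return None                   # "not there"

# The tree drawn on the slide.
root = Node(38,
            Node(25, Node(17, Node(4), Node(21)), Node(31, Node(28), Node(35))),
            Node(51, Node(42, Node(40), Node(49)), Node(63, Node(55), Node(71))))
print(tree_search(root, 17).key)   # 17
print(tree_search(root, 18))       # None
```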

Recursive Algorithm

Search for key 17.

If the key is not at the root, ask a friend to look for it in the appropriate subtree.

[Figure: binary search tree with root 38; second level 25, 51; third level 17, 31, 42, 63; leaves 4, 21, 28, 35, 40, 49, 55, 71]

Algorithm TreeSearch(k, v)
    if T.isExternal(v)
        return "not there"
    if k < key(v)
        return TreeSearch(k, T.left(v))
    else if k = key(v)
        return v
    else { k > key(v) }
        return TreeSearch(k, T.right(v))
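The recursive version translates almost line for line. As a sketch, again assuming `None` stands for an external node and using an illustrative `Node` class:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def tree_search(k, v):
    """Search for k in the subtree rooted at v; return the node or None."""
    if v is None:
        return None                       # "not there"
    if k < v.key:
        return tree_search(k, v.left)     # the friend searches the left subtree
    if k == v.key:
        return v
    return tree_search(k, v.right)        # the friend searches the right subtree

small = Node(25, Node(17, Node(4), Node(21)), Node(31, Node(28), Node(35)))
print(tree_search(21, small).key)   # 21
print(tree_search(30, small))       # None
```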

Insertions/Deletions

To insert(key, data):
• We search for key.
• Not being there, we end up in an empty tree.
• Insert the key there.

Example: Insert 10

[Figure: binary search tree on the keys 1 to 12; the search for 10 passes node v and ends at the empty position w, where 10 is inserted]
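The insertion steps above can be sketched in Python. The key order used to build the example tree is an assumption made for illustration (the slide's exact tree shape is not recoverable from the transcript), as are the names `Node`, `insert` and `inorder`.

```python
class Node:
    def __init__(self, key, data=None):
        self.key, self.data = key, data
        self.left = self.right = None

def insert(root, key, data=None):
    """Insert (key, data) and return the root of the tree."""
    if root is None:
        return Node(key, data)
    v = root
    while True:
        if key == v.key:                      # key already present: overwrite data
            v.data = data
            return root
        side = "left" if key < v.key else "right"
        child = getattr(v, side)
        if child is None:                     # the search ended in an empty tree
            setattr(v, side, Node(key, data)) # insert the key there
            return root
        v = child

def inorder(v):
    return [] if v is None else inorder(v.left) + [v.key] + inorder(v.right)

root = None
for k in [3, 1, 11, 9, 12, 8]:                # assumed build order
    root = insert(root, k)
root = insert(root, 10)
print(inorder(root))                          # [1, 3, 8, 9, 10, 11, 12]
```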

Insertions/Deletions

To Delete(keydel, data):
• If the node holding keydel does not have two children, link its one child (if any) directly to its parent.

Example: Delete 4

[Figure: binary search tree on the keys 1 to 12 with the node keydel = 4 marked]

Insertions/Deletions

To Delete(keydel, data):
• Else (the node has two children) find the next key in order, keynext: go right once, then left, left, left, ... until reaching an empty tree.
• Replace keydel with keynext.
• Link keynext's one child (if any) directly to keynext's parent.

Example: Delete 3

[Figure: binary search tree on the keys 1 to 12 with the nodes keydel = 3 and keynext marked]
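Both deletion cases can be sketched together. The tree below is built from the keys on the slide in an assumed insertion order, so its shape may differ from the slide's figure; `Node`, `insert`, `delete` and `inorder` are illustrative names.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(v, key):
    if v is None:
        return Node(key)
    if key < v.key:
        v.left = insert(v.left, key)
    elif key > v.key:
        v.right = insert(v.right, key)
    return v

def delete(v, key):
    """Delete key from the subtree rooted at v and return the new subtree root."""
    if v is None:
        return None                          # key not present
    if key < v.key:
        v.left = delete(v.left, key)
    elif key > v.key:
        v.right = delete(v.right, key)
    elif v.left is None:                     # at most one child: splice it in
        return v.right
    elif v.right is None:
        return v.left
    else:                                    # two children
        nxt = v.right                        # right once ...
        while nxt.left is not None:          # ... then left, left, left
            nxt = nxt.left
        v.key = nxt.key                      # replace keydel with keynext
        v.right = delete(v.right, nxt.key)   # keynext has at most one child
    return v

def inorder(v):
    return [] if v is None else inorder(v.left) + [v.key] + inorder(v.right)

root = None
for k in [3, 1, 11, 2, 9, 12, 8, 10, 6, 5, 7, 4]:   # assumed build order
    root = insert(root, k)
root = delete(root, 4)      # node without two children
root = delete(root, 3)      # node with two children
print(root.key, inorder(root))
```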

Performance

find, insert and remove take O(height) time.

In a balanced tree, the height is O(log n).

In the worst case, it is O(n).

Thus it is worthwhile to balance the tree (next topic)!

AVL Trees

The AVL tree is the first balanced binary search tree ever invented.

It is named after its two inventors, G.M. Adelson-Velskii and E.M. Landis, who published it in their 1962 paper "An algorithm for the organization of information."

AVL Trees

AVL trees are "mostly" balanced.

A tree is said to be an AVL tree if and only if the heights of siblings differ by at most 1.

height(v) = height of the subtree rooted at v.
balanceFactor(v) = height(leftChild(v)) - height(rightChild(v))

Equivalently, a tree is said to be an AVL tree if and only if ∀v, balanceFactor(v) ∈ {-1, 0, 1}.

[Figure: AVL tree with root 44; children 17 and 78; 17 has right child 32; 78 has children 50 and 88; 50 has children 48 and 62. Each node is annotated with its balanceFactor. At the root, the left subtree has height 2 and the right subtree has height 3, so balanceFactor = 2-3 = -1.]
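As a sketch, the AVL condition can be checked directly from the definitions. The code assumes the convention used in the slide's worked numbers (left height minus right height, empty tree of height 0); the names `Node`, `height`, `balance_factor` and `is_avl` are illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(v):
    """Height of the subtree rooted at v; the empty tree has height 0."""
    return 0 if v is None else 1 + max(height(v.left), height(v.right))

def balance_factor(v):
    # Assumed sign convention: left minus right, matching "2-3 = -1" on the slide.
    return height(v.left) - height(v.right)

def is_avl(v):
    """True if every node has balanceFactor in {-1, 0, 1}."""
    if v is None:
        return True
    return abs(balance_factor(v)) <= 1 and is_avl(v.left) and is_avl(v.right)

# The tree from the figure.
root = Node(44,
            Node(17, None, Node(32)),
            Node(78, Node(50, Node(48), Node(62)), Node(88)))
print(height(root.left), height(root.right), balance_factor(root))   # 2 3 -1
print(is_avl(root))                                                  # True
```

Recomputing heights on every call is quadratic in the worst case; a real AVL tree stores the height in each node.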

Height of an AVL Tree

Claim: The height of an AVL tree storing n keys is O(log n).

Proof: Let N(h) be the minimum number of nodes of an AVL tree of height h.

Observe that N(0) = 0 (the empty tree) and N(1) = 1 (a single node).

For h ≥ 2, the minimal AVL tree of height h contains
• the root node,
• one minimal AVL subtree of height h-1 (at least one of its subtrees has height h-1),
• another of height h-2 (since balanceFactor ≤ 1 = (h-1)-(h-2)).

That is,
$N(h) = 1 + N(h-1) + N(h-2) > N(h-1) + N(h-2)$,
so by induction $N(h) \ge \mathrm{Fibonacci}(h)$, which grows like $1.62^h$.

Ignoring constant factors, $n \ge 1.62^h$, and hence
$h \le \log(n)/\log(1.62) \approx 4.78 \log_{10}(n) \approx 1.44 \log_2(n)$.

Thus the height of an AVL tree is O(log n).
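The recurrence is easy to tabulate. The sketch below (the function names are made up for illustration) computes N(h) from the recurrence in the proof and compares it with the Fibonacci numbers.

```python
def min_nodes(h):
    """N(h): minimum number of nodes in an AVL tree of height h."""
    a, b = 0, 1                    # N(0), N(1)
    if h == 0:
        return a
    for _ in range(2, h + 1):
        a, b = b, 1 + a + b        # N(h) = 1 + N(h-1) + N(h-2)
    return b

def fibonacci(h):
    a, b = 0, 1                    # Fibonacci(0), Fibonacci(1)
    for _ in range(h):
        a, b = b, a + b
    return a

for h in range(7):
    print(h, min_nodes(h), fibonacci(h))
```

In this table N(h) equals Fibonacci(h+2) - 1, so the minimum size grows exponentially in h, which is what forces the height down to O(log n).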

Rebalancing

A rotation changes heights/balanceFactors:
• Subtree [..,5] moves up one level.
• Subtree [5,10] stays at the same level.
• Subtree [10,..] moves down one level.

It does not change the binary search tree ordering.

Rebalancing after an Insertion

Inserting new leaf 2 into an AVL tree may create an imbalance.

[Figure, before: root 7 with children 4 and 8; 4 has children 3 and 5; the new leaf 2 hangs under 3. The subtrees of 7 have heights 3 and 1, so balanceFactor = 3-1 = 2. Problem! No longer an AVL tree.]

rotateR(7)

[Figure, after: root 4 with children 3 and 7; 3 has child 2; 7 has children 5 and 8. The subtrees of 4 both have height 2, so balanceFactor = 2-2 = 0; at 7, balanceFactor = 1-1 = 0. Rebalanced into an AVL tree again.]
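A right rotation is only a few pointer updates. The following sketch replays the example above; `Node`, `rotate_right` and `inorder` are illustrative names, and heights are not stored in the nodes here.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(z):
    """Rotate the subtree rooted at z to the right; return its new root."""
    y = z.left
    z.left = y.right      # y's right subtree moves across to z
    y.right = z           # z becomes y's right child
    return y

def inorder(v):
    return [] if v is None else inorder(v.left) + [v.key] + inorder(v.right)

# Tree after inserting leaf 2: 7 has children 4 and 8; 4 has children 3 and 5; 2 is under 3.
root = Node(7, Node(4, Node(3, Node(2)), Node(5)), Node(8))
before = inorder(root)
root = rotate_right(root)
print(root.key, root.left.key, root.right.key)   # 4 3 7
print(inorder(root) == before)                   # True: ordering unchanged
```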

Rebalancing after an Insertion

Try another example: inserting new leaf 6 into the AVL tree.

[Figure, before: root 7 with children 4 and 8; 4 has children 3 and 5; the new leaf 6 hangs under 5. The subtrees of 7 have heights 3 and 1, so balanceFactor = 3-1 = 2. Problem!]

rotateR(7)

[Figure, after: root 4 with children 3 and 7; 7 has children 5 and 8; 6 hangs under 5. The subtrees of 4 have heights 1 and 3, so balanceFactor = 1-3 = -2.]

Oops! Not an AVL tree.

Rebalancing after an Insertion

[Figure: the same example; rotateR(7) on the tree with new leaf 6 does not restore balance]

The imbalanced node has balanceFactor ∈ {-2,+2} and its taller child has balanceFactor ∈ {-1,0,+1}.

There are 6 cases. Two are easier.

Rebalancing after an Insertion

There are 6 cases. Two are easier.

Half are symmetrically the same.

This leaves two.

[Figure: four configurations of the nodes z, y, x with subtrees T0 to T3, two with balanceFactor(z) = +2 and their two mirror images with balanceFactor(z) = -2. In each, z has height h, its child y has height h-1 and balanceFactor +1 or -1, the sibling subtree of y has height h-3, x has height h-2, the sibling subtree of x has height h-3, and of x's two subtrees one is h-3 & one is h-4.]

Rebalancing after an Insertion

Inserting new leaf 2 into an AVL tree may create an imbalance on the path from leaf to root.

The insertion increases heights along the path from leaf to root.

[Figure: root 7 with children 4 and 8; 4 has children 3 and 5; new leaf 2 under 3. balanceFactors along the path: 0 at node 2, +1 at node 3, +1 at node 4, +2 at node 7. Problem!]

Rebalancing after an Insertion

The repair strategy is called trinode restructuring.

Denote
  z = the lowest imbalanced node
  y = the child of z with the tallest subtree
  x = the child of y with the tallest subtree

[Figure: root 7 (balanceFactor +2) with children 4 and 8; 4 has children 3 and 5; new leaf 2 under 3. Problem!]

Rebalancing after an Insertion

The repair strategy is called trinode restructuring.

Defn: h = height of z.

y = the child of z with the tallest subtree. At least one of z's subtrees has height h-1, so y has height h-1.

[Figure: z with child y and grandchild x, and subtrees T1, T2, T3]

Rebalancing after an Insertion

The repair strategy is called trinode restructuring.

Assume
  z = the lowest imbalanced node
  x = the child of y with the tallest subtree
Defn: h = height of z.

balanceFactor of z: it was in {-1,0,1} but only got one worse with the insertion, so it is in {-2,2}. By way of symmetry assume +2.

balanceFactor of y: it is in {-1,0,1}. Cases:
• balanceFactor = +1: we do now.
• balanceFactor = -1: we will do later.
• balanceFactor = 0: is the same as -1.

[Figure: z (balanceFactor +2) with child y (height h-1, balanceFactor +1) and subtree T3 (height h-3); y has child x (height h-2, inside T1) and subtree T2 (height h-3)]

Rebalancing after an Insertion

The repair strategy is called trinode restructuring.

rotateR(z)

[Figure, before: z with child y and subtree T3 (height h-3); y has child x (height h-2, inside T1) and subtree T2 (height h-3).]

[Figure, after: y is the root with children x and z; z has subtrees T2 and T3, each of height h-3, so z has height h-2; x has height h-2; the whole subtree has height h-1.]

The ordering is preserved: T1 ≤ y ≤ T2 ≤ z ≤ T3.

This subtree is balanced.

Rebalancing after an Insertion

[Figure: the subtree before (height h) and after (height h-1) rotateR(z), each attached to the rest of the tree]

This subtree is balanced.

Is the rest of the tree sufficiently balanced to make it an AVL tree?
• Before the insert it was.
• Insert made this subtree one higher.
• Our restructuring made it back to the original height.
• Hence the whole tree is an AVL tree.

Rebalancing after an Insertion

Try another example: inserting new leaf 6 into the AVL tree.

[Figure: root 7 with children 4 and 8; 4 has children 3 and 5; new leaf 6 under 5]

Rebalancing after an Insertion

Try another example: inserting new leaf 6 into the AVL tree.

rotateR(z)

[Figure: with z = 7 and y = 4, rotateR(z) makes y the root with children 3 and z; z has children 5 and 8, and 6 hangs under 5. At y, balanceFactor = 1-3 = -2.]

Oops! Not an AVL tree.

Rebalancing after an Insertion

Try another example: inserting new leaf 6 into the AVL tree.

The insertion increases heights along the path from leaf to root.

[Figure: root 7 with children 4 and 8; 4 has children 3 and 5; new leaf 6 under 5. balanceFactors along the path: 0 at node 6, -1 at node 5, -1 at node 4, +2 at node 7. Problem!]

Rebalancing after an Insertion

The repair strategy is called trinode restructuring.

Denote
  z = the lowest imbalanced node
  y = the child of z with the tallest subtree
  x = the child of y with the tallest subtree

[Figure: root 7 (balanceFactor +2) with children 4 (balanceFactor -1) and 8; 4 has children 3 and 5 (balanceFactor -1); new leaf 6 under 5. Problem!]

Rebalancing after an Insertion

The repair strategy is called trinode restructuring.

  z = the lowest imbalanced node
  y = the child of z with the tallest subtree
  x = the child of y with the tallest subtree
Defn: h = height of z.

Assume the second case: balanceFactor(z) = +2 and balanceFactor(y) = -1.

[Figure: z with child y (height h-1) and subtree T4 (height h-3); y has subtree T1 (height h-3) and child x (height h-2); x has subtrees T2 and T3, of which one is h-3 & one maybe h-4]

Rebalancing after an Insertion

The repair strategy is called trinode restructuring.

rotateL(y)

[Figure: after rotateL(y), x is the child of z and y is the child of x; subtrees T1 and T4 have height h-3, and of T2 and T3 one is h-3 & one maybe h-4]

y ≤ x ≤ z

Rebalancing after an Insertion

The repair strategy is called trinode restructuring.

rotateL(y), then rotateR(z)

[Figure: after rotateL(y), x is the child of z and y is the child of x; after rotateR(z), x is the root with children y and z]

y ≤ x ≤ z

Rebalancing after an Insertion

The repair strategy is called trinode restructuring.

[Figure, before: z (height h) with child y and subtree T4 (height h-3); y has subtree T1 (height h-3) and child x; x has subtrees T2 and T3, of which one is h-3 & one maybe h-4. The subtree hangs off the rest of the tree.]

[Figure, after: x (height h-1) is the root with children y and z, each of height h-2; y has subtrees T1 and T2; z has subtrees T3 and T4. The subtree hangs off the rest of the tree.]

The ordering is preserved: T1 ≤ y ≤ T2 ≤ x ≤ T3 ≤ z ≤ T4.

• This subtree is balanced.
• And shorter by one.
• Hence the whole is an AVL tree.

Rebalancing after an Insertion
Example: Insert 12

[Figure: AVL tree with each node annotated with its subtree height; external nodes have height 0]

Rebalancing after an Insertion
Example: Insert 12

Step 1.1: top-down search

[Figure: the same AVL tree; the search for 12 ends at the external node w]

Rebalancing after an Insertion
Example: Insert 12

Step 1.2: expand w and insert the new item in it

[Figure: w now holds 12, with height 1 and two external children of height 0]

Rebalancing after an Insertion
Example: Insert 12

Step 2.1: move up along the ancestral path of w; update ancestor heights; find the unbalanced node.

[Figure: heights along the path from 12 to the root are increased by one; the first node whose children's heights differ by more than 1 is marked "imbalance"]

Rebalancing after an Insertion
Example: Insert 12

Step 2.2: trinode discovered (needs double rotation)

[Figure: the nodes x, y, z are marked on the path from 12 up to the imbalanced node]

Rebalancing after an Insertion
Example: Insert 12

Step 2.3: trinode restructured; balance restored. DONE!

[Figure: x is now the root of the restructured subtree, with y and z as its children; the heights above it are back to their values before the insertion]
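Putting the steps together (search, insert, update heights on the way back up, restructure at the lowest imbalanced node), an AVL insertion can be sketched as below. This is one possible recursive formulation, not the textbook's code; the names are illustrative, heights are stored in the nodes, the balance factor is taken as left minus right, and the demo replays the small "insert 6" example from the earlier slides rather than the larger figure.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 1

def h(v):
    return v.height if v else 0              # empty tree has height 0

def update(v):
    v.height = 1 + max(h(v.left), h(v.right))

def bf(v):
    return h(v.left) - h(v.right)            # assumed convention: left minus right

def rotate_right(z):
    y = z.left
    z.left, y.right = y.right, z
    update(z); update(y)
    return y

def rotate_left(z):
    y = z.right
    z.right, y.left = y.left, z
    update(z); update(y)
    return y

def insert(v, key):
    """Insert key below v, rebalance on the way back up, return the subtree root."""
    if v is None:
        return Node(key)
    if key < v.key:
        v.left = insert(v.left, key)
    elif key > v.key:
        v.right = insert(v.right, key)
    else:
        return v                             # key already present
    update(v)
    if bf(v) == 2:                           # v plays the role of z, left-heavy
        if bf(v.left) < 0:                   # y leans right: double rotation
            v.left = rotate_left(v.left)
        return rotate_right(v)
    if bf(v) == -2:                          # mirror image
        if bf(v.right) > 0:
            v.right = rotate_right(v.right)
        return rotate_left(v)
    return v

def is_avl(v):
    return v is None or (abs(bf(v)) <= 1 and is_avl(v.left) and is_avl(v.right))

root = None
for k in [7, 4, 8, 3, 5]:
    root = insert(root, k)
root = insert(root, 6)                       # triggers rotateL(4), then rotateR(7)
print(root.key, root.left.key, root.right.key)   # 5 4 7
```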

Rebalancing after a deletion

Very similar to before.

Unfortunately, trinode restructuring may reduce the height of the subtree, causing another imbalance further up the tree.

Thus this search and repair process must, in the worst case, be repeated until we reach the root.

See text for implementation.

Running Times for AVL Trees

• A single restructure is O(1), using a linked-structure binary tree.
• find is O(log n): the height of the tree is O(log n), and no restructures are needed.
• insert is O(log n): the initial find is O(log n); restructuring is O(1).
• remove is O(log n): the initial find is O(log n); restructuring up the tree, maintaining heights, is O(log n).

Other Similar Balanced Trees

• Red-Black Trees: balanced because of rules about red and black nodes.
• (2-4) Trees: balanced by having between 2 and 4 children.
• Splay Trees: move used nodes to the root.

Union-Find Partition Structures

Andy Mirzaian. Last Update: Dec 4, 2014

Partitions with Union-Find Operations

• makeSet(x): Create a singleton set containing the element x and return the position storing x in this set.
• union(A,B): Return the set A ∪ B, destroying the old A and B.
• find(p): Return the set containing the element at position p.

List-based Implementation

• Each set is stored in a sequence represented with a linked list.
• Each node should store an object containing the element and a reference to the set name.

Analysis of List-based Representation

• When doing a union, always move elements from the smaller set to the larger set.
  – Each time an element is moved, it goes to a set of size at least double its old set.
  – Thus, an element can be moved at most O(log n) times.
• Total time needed to do n unions and finds is O(n log n).

Tree-based Implementation

Each set is stored as a rooted tree of its elements:
• Each element points to its parent.
• The root is the "name" of the set.
• Example: The sets "1", "2", and "5":

[Figure: three trees: root 1 with elements 4 and 7; root 2 with elements 3 and 6; root 5 with elements 8, 9, 10, 11, 12]

Union-Find Operations

• To do a union, simply make the root of one tree point to the root of the other.
• To do a find, follow set-name pointers from the starting node until reaching a node whose set-name pointer refers back to itself.

[Figure: the trees rooted at 2 and 5, before and after the union that makes one root point to the other]

Union-Find Heuristic 1

• Union by size:
  – When performing a union, make the root of the smaller tree point to the root of the larger.
• Implies O(n log n) time for performing n union-find operations:
  – Each time we follow a pointer, we are going to a subtree of size at least double the size of the previous subtree.
  – Thus, we will follow at most O(log n) pointers for any find.

[Figure: the tree rooted at 2 attached under the root 5 of the larger tree]

Union-Find Heuristic 2

• Path compression:
  – After performing a find, compress all the pointers on the path just traversed so that they all point to the root.
• Implies O(n log* n) time for performing n union-find operations:
  – [Proof is somewhat involved and is covered in EECS 4101]

[Figure: a tree before and after a find; every node on the traversed path now points directly to the root]
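The tree-based structure with both heuristics fits in a few lines. This is a sketch in Python rather than the Java implementation the slides refer to; the class name `UnionFind`, the dictionary-based storage and the order of the unions in the demo are assumptions made for illustration.

```python
class UnionFind:
    def __init__(self):
        self.parent = {}     # element -> parent; a root points to itself
        self.size = {}       # root -> number of elements in its tree

    def make_set(self, x):
        self.parent[x] = x
        self.size[x] = 1
        return x

    def find(self, p):
        root = p
        while self.parent[root] != root:     # follow pointers up to the root
            root = self.parent[root]
        while self.parent[p] != root:        # path compression: repoint the path
            nxt = self.parent[p]
            self.parent[p] = root
            p = nxt
        return root

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return a
        if self.size[a] < self.size[b]:      # union by size: smaller under larger
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]
        return a

uf = UnionFind()
for x in range(1, 13):
    uf.make_set(x)
# Build the sets "1", "2" and "5" from the example.
for a, b in [(1, 7), (1, 4), (2, 6), (2, 3), (5, 10), (5, 8), (5, 12), (12, 11), (12, 9)]:
    uf.union(a, b)
print(uf.find(7), uf.find(3), uf.find(9))    # 1 2 5
uf.union(2, 5)
print(uf.find(3) == uf.find(9))              # True
```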

Java Implementation

Heaps, Heap Sort, & Priority Queues

J. W. J. Williams, 1964

Abstract Data Types

Restricted Data Structure: Sometimes we limit what operations can be done
• for efficiency
• for understanding

Stack: A list, but elements can only be pushed onto and popped from the top.

Queue: A list, but elements can only be added at the end and removed from the front.
• Important in handling jobs.

Priority Queue: The "highest priority" element is handled next.

Priority Queues

                                             Sorted List   Unsorted List   Heap
• Items arrive with a priority.              O(n)          O(1)            O(log n)
• Item removed is that with
  highest priority.                          O(1)          O(n)            O(log n)

Heap Definition

• Completely Balanced Binary Tree
• The value of each node ≥ each of the node's children.
• Left or right child could be larger.

Maximum is at root.

Where can 1 go?
Where can 8 go?
Where can 9 go?

Heap Data Structure

Completely Balanced Binary Tree

Implemented by an Array

Make Heap

Get help from friends

Heapify

Maximum needs to be at root.

Where is the maximum?

Heapify

Find the maximum. Put it in place.

Repeat.

The 5 "bubbles" down until it finds its spot.

Heapify

The 5 "bubbles" down until it finds its spot.

The result is a heap.
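Heapify can be sketched on the array representation. The index arithmetic (children of position i at 2i+1 and 2i+2, with 0-based indexing) is the usual layout but is an assumption here, since the slides only say the heap is "implemented by an array"; the example values are made up.

```python
def heapify(a, i, n):
    """Bubble a[i] down within a[0:n], assuming both subtrees of i are max-heaps."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:          # a[i] is at least as large as its children: done
            return
        a[i], a[largest] = a[largest], a[i]   # the max of the three moves up
        i = largest

a = [5, 9, 8, 7, 6, 3, 4]         # both subtrees of the root are already heaps
heapify(a, 0, len(a))
print(a)                          # [9, 7, 8, 5, 6, 3, 4]
```

Each iteration moves one level down the tree, so the loop runs at most O(log n) times.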

Heapify

Running Time: O(log n)

Heapify: Iterative

Heapify: Recursive

Make Heap: Recursive

Get help from friends: each friend makes one subtree into a heap, then Heapify fixes the root.

Running time:
$T(n) = 2T(n/2) + \log(n) = \Theta(n)$

Make Heap: Iterative

[Figure sequence: the subtrees that are already heaps grow from the bottom of the tree upward; at each step Heapify is applied at the node marked "?", until the whole tree is a heap]

Make Heap: Iterative

Running Time:

[Figure: a node at height i above the bottom is at depth log(n) - i, and there are $2^{\log(n) - i}$ nodes at that depth]
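A sketch of the iterative (bottom-up) construction, with the same assumed array layout as before. The cost of a node at height i is O(i) and there are about $2^{\log(n)-i}$ such nodes, so the total is $\sum_i i \cdot 2^{\log(n)-i} = n \sum_i i/2^i = \Theta(n)$. The names and the sample data are illustrative.

```python
def heapify(a, i, n):
    """Bubble a[i] down within a[0:n]; subtrees of i are assumed to be max-heaps."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def make_heap(a):
    """Turn the list a into a max-heap in place, working from the bottom up."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # leaves are already heaps; skip them
        heapify(a, i, n)

def is_heap(a):
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))

a = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
make_heap(a)
print(a[0], is_heap(a))                   # 9 True
```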

Heap Pop/Push/Changes

With Pop, a Priority Queue returns the highest priority data item.

This is at the root.

[Figure: heap with 21 at the root]

Heap Pop/Push/Changes

But this is now the wrong shape!

To keep the shape of the tree, which space should be deleted?

Heap Pop/Push/Changes

What do we do with the element that was there?

Move it to the root.

[Figure: the last element, 3, is moved to the root]

Heap Pop/Push/Changes

But now it is not a heap!

The left and right subtrees still are heaps.

[Figure: heap-shaped tree with 3 at the root]

Heap Pop/Push/Changes

But now it is not a heap!

The 3 "bubbles" down until it finds its spot.

The max of these three moves up.

Time = O(log n)

Heap Pop/Push/Changes

When inserting a new item, to keep the shape of the tree, which new space should be filled?

[Figure: the new item 21 is placed in the next free position of the bottom level]

Heap Pop/Push/Changes

But now it is not a heap!

The 21 "bubbles" up until it finds its spot.

The max of these two moves up.

[Figure: 21 is compared with its parent on the way up; 30 stays above it]

Time = O(log n)
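Pop and push can be sketched on a Python list used as the heap array (0-based layout assumed; function names and sample priorities are illustrative).

```python
def push(heap, item):
    """Add item in the next free slot, then bubble it up."""
    heap.append(item)                       # keeps the shape of the tree
    i = len(heap) - 1
    while i > 0 and heap[(i - 1) // 2] < heap[i]:
        p = (i - 1) // 2
        heap[p], heap[i] = heap[i], heap[p] # the max of the two moves up
        i = p

def pop(heap):
    """Remove and return the highest-priority item (the root)."""
    top = heap[0]
    last = heap.pop()                       # delete the last slot to keep the shape
    if heap:
        heap[0] = last                      # move its element to the root
        i, n = 0, len(heap)
        while True:                         # bubble it down
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < n and heap[left] > heap[largest]:
                largest = left
            if right < n and heap[right] > heap[largest]:
                largest = right
            if largest == i:
                break
            heap[i], heap[largest] = heap[largest], heap[i]
            i = largest
    return top

heap = []
for priority in [21, 3, 30, 39, 27]:
    push(heap, priority)
order = [pop(heap) for _ in range(5)]
print(order)                                # [39, 30, 27, 21, 3]
```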

Adaptable Heap Pop/Push/Changes

Suppose some outside user knows about some data item c and remembers where it is in the heap, and changes its priority from 21 to 39.

But now it is not a heap!

The 39 "bubbles" down or up until it finds its spot.

[Figure: item c, with priority changed from 21 to 39, sits below a node with priority 27]

Adaptable Heap Pop/Push/Changes

Suppose some outside user also knows about data item f, and its location in the heap just changed. The heap must be able to find this outside user and tell him it moved.

Time = O(log n)

Heap Implementation

• A location-aware heap entry is an object storing
  • key
  • value
  • position of the entry in the underlying heap
• In turn, each heap position stores an entry
• Back pointers are updated during entry swaps
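One way to picture location-aware entries is the sketch below. It assumes a max-heap (as in the earlier pop/push slides) stored in a Python list; the class and method names (`Entry`, `AdaptableMaxHeap`, `change_key`) are hypothetical and not taken from the course text.

```python
class Entry:
    """Location-aware entry: key, value, and its current index in the heap."""
    def __init__(self, key, value):
        self.key, self.value, self.pos = key, value, None


class AdaptableMaxHeap:
    def __init__(self):
        self.a = []                      # heap positions, each storing an entry

    def _swap(self, i, j):
        a = self.a
        a[i], a[j] = a[j], a[i]
        a[i].pos, a[j].pos = i, j        # back pointers updated on every swap

    def _up(self, i):
        while i > 0 and self.a[(i - 1) // 2].key < self.a[i].key:
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2

    def _down(self, i):
        n = len(self.a)
        while True:
            big = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and self.a[c].key > self.a[big].key:
                    big = c
            if big == i:
                return
            self._swap(i, big)
            i = big

    def push(self, key, value):
        e = Entry(key, value)
        e.pos = len(self.a)
        self.a.append(e)
        self._up(e.pos)
        return e                         # the outside user keeps this handle

    def change_key(self, e, new_key):
        e.key = new_key
        self._up(e.pos)                  # at most one of these two moves e
        self._down(e.pos)
```

Because every entry carries its own position, the outside user never has to search the heap: `change_key` starts from `e.pos` and costs O(log n).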

Last Update: Oct 23, 2014

Andy

[Figure: a heap of location-aware entries (4 a, 2 d, 6 b, 8 g, 5 e, 9 c), each stored at a heap position.]

Selection Sort

Largest i values are sorted on side. Remaining values are off to side.

[Figure: the largest values 6, 7, 8, 9 are sorted on one side; the remaining values 3, 4, 1, 5, 2 are off to the other side.]

Max is easier to find if a heap.

Heap Sort

Largest i values are sorted on side. Remaining values are in a heap.

Heap Data Structure

[Figure: a heap drawn as a tree next to the array that stores it, holding the values 1 through 9.]

Heap Sort

Largest i values are sorted on side. Remaining values are in a heap.

Put next value where it belongs.
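The loop invariant above (largest values sorted on one side, remaining values in a heap) can be sketched as an in-place sort. This assumes the usual array layout of a max-heap in the front of the list with the sorted values at the back; `sift_down` and `heap_sort` are names chosen for the sketch.

```python
def sift_down(a, i, n):
    """Restore the max-heap property for a[:n], starting at index i."""
    while True:
        big = i
        for c in (2 * i + 1, 2 * i + 2):
            if c < n and a[c] > a[big]:
                big = c
        if big == i:
            return
        a[i], a[big] = a[big], a[i]
        i = big


def heap_sort(a):
    """Sort the list a in place and return it."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # turn the whole array into a heap
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]       # put the next value where it belongs
        sift_down(a, 0, end)              # a[:end] is a heap again
    return a
```

After each pass of the second loop, `a[end:]` holds the largest values in sorted order and `a[:end]` is still a heap.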

Heap Sort

Running Time:

Communication & Entropy

In thermodynamics, entropy is a measure of disorder. It is measured as the logarithm of the number of specific ways in which the micro world may be arranged, given the macro world.

Lazare Carnot (1803)

Tell Uncle Lazare the location and the velocity of each particle.

The log of the number of possibilities equals the number of bits to communicate it.

01101000

Few bits needed: low entropy.

Lots of bits needed: high entropy.

Communication & Entropy

Tell Uncle Shannon which toy you want.

Bla bla bla bla bla bla

No. Please use the minimum number of bits to communicate it.

01101000

Great, but we need a code.

011 01 1 …

011 Oops. Was that one toy or two?

Claude Shannon (1948)

Communication & Entropy

Use a Huffman Code described by a binary tree.

00100

I follow the path and get the toy.

Communication & Entropy

001000101

I first get one toy, then I start over to get the next toy.
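Decoding by following paths in the code tree can be sketched as follows. The code table here is hypothetical (it is not the tree from the slide); the only assumption is that no codeword is a prefix of another, so every codeword ends at a leaf.

```python
def decode(bits, code):
    """Decode a bit string using a prefix-free code {symbol: codeword}."""
    # Build the binary tree as nested dicts; leaves are the symbols.
    root = {}
    for symbol, word in code.items():
        node = root
        for b in word[:-1]:
            node = node.setdefault(b, {})
        node[word[-1]] = symbol

    out, node = [], root
    for b in bits:
        node = node[b]                   # follow the path one edge per bit
        if not isinstance(node, dict):   # reached a leaf: one object decoded
            out.append(node)
            node = root                  # start over at the root
    return out


# Hypothetical toy code, used only for illustration.
toy_code = {"ball": "1", "car": "01", "doll": "001", "kite": "000"}
```

With this table, the bit string 0110001 splits unambiguously as 01, 1, 000, 1.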

Communication & Entropy

Objects that are more likely will have shorter codes.

I get it. I am likely to answer with this toy, so you give it a 1-bit code.

Communication & Entropy

$p_i$ is the probability of the ith toy. $L_i$ is the length of the code for the ith toy.

$p_i$ = 0.495, 0.13, 0.11, 0.05, 0.031, 0.02, 0.02, 0.08, 0.02, 0.02, 0.01, 0.01

The expected number of bits sent is $\sum_i p_i L_i$.

We choose the code lengths $L_i$ to minimize this. Then we call it the Entropy of the distribution on toys.

Communication & Entropy

Ingredients:
• Instances: Probabilities of objects $\langle p_1, p_2, \ldots, p_n \rangle$.
• Solutions: A Huffman code tree.
• Cost of Solution: The expected number of bits sent = $\sum_i p_i L_i$.
• Goal: Given probabilities, find code with minimum number of expected bits.

Communication & Entropy

Greedy Algorithm.
• Put the objects in a priority queue sorted by probabilities.
• Take the two objects with the smallest probabilities.
• They should have the longest codes.
• Put them in a little tree.
• Join them into one object, with the sum probability.
• Repeat.

Objects: 0.495, 0.13, 0.11, 0.05, 0.03, 0.02, 0.02, 0.08, 0.02, 0.02, 0.015, 0.01

First join: 0.015 + 0.01 = 0.025

Communication & Entropy

Greedy Algorithm, repeated. Each step joins the two objects with the smallest probabilities:
• 0.02 + 0.02 = 0.04
• 0.02 + 0.02 = 0.04
• 0.025 + 0.03 = 0.055
• 0.04 + 0.04 = 0.08
• 0.05 + 0.055 = 0.105
• 0.08 + 0.08 = 0.16
• 0.105 + 0.11 = 0.215
• 0.13 + 0.16 = 0.29
• 0.215 + 0.29 = 0.505
• 0.495 + 0.505 = 1

Communication & Entropy

Greedy Algorithm.
• Done when one object (of probability 1) remains.
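The greedy algorithm can be sketched with Python's `heapq` module as the priority queue. This is an illustrative sketch: it returns only the code lengths $L_i$ (the depth of each leaf), and the function name `huffman_code_lengths` and the tie-breaking counter are choices made here, not part of the lecture.

```python
import heapq
import itertools


def huffman_code_lengths(probs):
    """Return the Huffman code length L_i for each probability p_i."""
    tie = itertools.count()   # tie-breaker so equal probabilities compare cleanly
    # Each heap item: (probability, tie-breaker, indices of the leaves below it).
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        # Take the two objects with the smallest probabilities.
        p1, _, leaves1 = heapq.heappop(heap)
        p2, _, leaves2 = heapq.heappop(heap)
        # Putting them in a little tree makes every leaf below one bit longer.
        for i in leaves1 + leaves2:
            lengths[i] += 1
        # Join them into one object with the sum probability.
        heapq.heappush(heap, (p1 + p2, next(tie), leaves1 + leaves2))
    return lengths


# Probabilities from the greedy-algorithm slides.
toy_probs = [0.495, 0.13, 0.11, 0.05, 0.03, 0.02, 0.02,
             0.08, 0.02, 0.02, 0.015, 0.01]
```

On `toy_probs` the most likely toy gets a 1-bit code, and the expected number of bits $\sum_i p_i L_i$ comes to 2.515, which equals the sum of the joined probabilities formed along the way.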

Communication & Entropy

Huffman’s algorithm says how to choose the code lengths $L_i$ to minimize the expected number of bits sent, $\sum_i p_i L_i$. We want a nice equation for this number. What if we relax the condition that $L_i$ is an integer?

Communication & Entropy

The expected number of bits sent, $\sum_i p_i L_i$, is minimized by setting $L_i = \log(1/p_i)$ (logs are base 2). Why?
• Suppose all toys had probability $p_i$ = 0.031.
• Then there would be $1/p_i$ = 32 toys.
• Then the codes would have length $L_i = \log(1/p_i) = 5$.

Communication & Entropy

The expected number of bits sent, $\sum_i p_i L_i$, is minimized by setting $L_i = \log(1/p_i)$, giving the expected number of bits $H(p) = \sum_i p_i \log(1/p_i)$. (Entropy)

(The answer given by Huffman Codes is at most one bit longer.)
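As a small numerical check of the entropy formula, here is a sketch in Python. It assumes base-2 logarithms (bits); the function name `entropy` is chosen for this sketch.

```python
import math


def entropy(probs):
    """H(p) = sum_i p_i * log2(1/p_i), skipping zero-probability outcomes."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)


# 32 equally likely toys: each ideal code length is log2(32) = 5 bits.
uniform_32 = [1 / 32] * 32
```

For the uniform case every ideal length is an integer, so the entropy and the Huffman answer coincide at 5 bits; for skewed distributions the Huffman answer can exceed H(p), but by less than one bit.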

Communication & Entropy

Let X, Y, and Z be random variables, i.e. they take on random values according to some probability distributions.

X = toy chosen by this distribution.

Once the values are chosen, the expected number of bits needed to communicate the value of X is

$H(p) = \sum_i p_i \log(1/p_i)$ (Entropy)

$H(X) = \sum_x \Pr(X=x) \log(1/\Pr(X=x))$


Entropy

The Entropy H(X) is the expected number of bits to communicate the value of X.

It can be drawn as the area of this circle.


Entropy

H(XY) then is the expected number of bits to communicate the value of both X and Y.

Entropy

If I tell you the value of Y, then H(X|Y) is the expected number of bits to communicate the value of X. Note that if X and Y are independent, then knowing Y does not help and H(X|Y) = H(X).

Entropy

I(X;Y) is the number of bits that are revealed about X by me telling you Y, or about Y by telling you X. Note that if X and Y are independent, then knowing Y does not help and I(X;Y) = 0.
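These quantities can be computed from a joint distribution. The sketch below uses the standard identities H(X|Y) = H(XY) − H(Y) and I(X;Y) = H(X) + H(Y) − H(XY), which correspond to the overlapping-circles picture; the function names and the dictionary representation of the joint distribution are assumptions of this sketch.

```python
import math
from collections import defaultdict


def H(dist):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)


def info_measures(joint):
    """joint maps (x, y) pairs to probabilities; returns the entropy quantities."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p                       # marginal distribution of X
        py[y] += p                       # marginal distribution of Y
    hx, hy, hxy = H(px), H(py), H(joint)
    return {
        "H(X)": hx,
        "H(Y)": hy,
        "H(XY)": hxy,
        "H(X|Y)": hxy - hy,              # bits still needed for X once Y is known
        "I(X;Y)": hx + hy - hxy,         # bits about X revealed by telling Y
    }


# Two independent fair bits, and two perfectly correlated fair bits.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
copied = {(0, 0): 0.5, (1, 1): 0.5}
```

For `independent`, telling Y reveals nothing about X; for `copied`, telling Y reveals the full bit of X.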


Splay Trees

Self-balancing BST

Invented by Daniel Sleator and Bob Tarjan

Allows quick access to recently accessed elements

Bad: worst-case O(n)

Good: average (amortized) case O(log n)

Often perform better than other BSTs in practice


Splaying

Splaying is an operation performed on a node that iteratively moves the node to the root of the tree.

In splay trees, each BST operation (find, insert, remove) is augmented with a splay operation.

In this way, recently searched and inserted elements are near the top of the tree, for quick access.


3 Types of Splay Steps

Each splay operation on a node consists of a sequence of splay steps.

Each splay step moves the node up toward the root by 1 or 2 levels.

There are 3 types of step:
• Zig-Zig
• Zig-Zag
• Zig

These steps are iterated until the node is moved to the root.


Zig-Zig

Performed when the node x forms a linear chain with its parent and grandparent, i.e., right-right or left-left.

[Figure: zig-zig. Before: z has left child y and right subtree T4; y has left child x and right subtree T3; x has subtrees T1 and T2. After: x has left subtree T1 and right child y; y has left subtree T2 and right child z; z has subtrees T3 and T4.]

Zig-Zag

Performed when the node x forms a non-linear chain with its parent and grandparent, i.e., right-left or left-right.

[Figure: zig-zag. Before: z has left subtree T1 and right child y; y has left child x and right subtree T4; x has subtrees T2 and T3. After: x is the root with left child z (subtrees T1, T2) and right child y (subtrees T3, T4).]

Zig

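Performed when the node x has no grandparent, i.e., its parent is the root.

[Figure: zig. A single rotation lifts x above its parent y, the root; x's child w and the subtrees T1 to T4 keep their left-to-right order.]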
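The three splay steps can be sketched with nodes that carry parent pointers. This is an illustrative sketch, not the textbook's implementation: it uses `None` in place of external (leaf) nodes, and the names `Node`, `rotate_up`, `splay`, and `inorder` are chosen here.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate_up(x):
    """Rotate x above its parent, preserving the inorder sequence."""
    p = x.parent
    g = p.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(x):
    """Move x to the root by repeated zig-zig, zig-zag, and zig steps."""
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)                      # zig: parent is the root
        elif (g.left is p) == (p.left is x):
            rotate_up(p)                      # zig-zig: rotate the parent first,
            rotate_up(x)                      # then the node
        else:
            rotate_up(x)                      # zig-zag: rotate the node twice
            rotate_up(x)
    return x


def inorder(node):
    """Return the keys of the subtree in sorted (inorder) order."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)
```

A zig-zig or zig-zag step moves x up two levels and a zig moves it up one, so the loop ends with x at the root and the inorder sequence unchanged.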

Splay Trees & Ordered Dictionaries

Which nodes are splayed after each operation?

method: splay node
• find(k): if key found, use that node; if key not found, use parent of external node where search terminated
• insert(k,v): use the new node containing the entry inserted
• remove(k): use the parent of the internal node w that was actually removed from the tree (the parent of the node that the removed item was swapped with)

Recall BST Deletion

Now consider the case where the key k to be removed is stored at a node v whose children are both internal:
• we find the internal node w that follows v in an inorder traversal
• we copy key(w) into node v
• we remove node w and its left child z (which must be a leaf) by means of operation removeExternal(z)

Example: remove 3. Which node will be splayed?

[Figure: before, v holds 3 and w holds 5, the key that follows 3 in an inorder traversal, with leaf child z; the other keys are 1, 2, 6, 8, 9. After, v holds 5 and w is gone.]
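The deletion rule recalled above (copy the inorder successor's key into v, then remove the successor) can be sketched as follows. The sketch assumes `None` stands in for external nodes, so there is no removeExternal operation, and the example tree is only an approximation of the one in the figure; `Node`, `remove`, and `inorder` are names chosen here.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def remove(root, k):
    """Remove key k from the subtree rooted at root; return the new subtree root."""
    if root is None:
        return None
    if k < root.key:
        root.left = remove(root.left, k)
    elif k > root.key:
        root.right = remove(root.right, k)
    elif root.left is None:
        return root.right                # at most one child: splice the node out
    elif root.right is None:
        return root.left
    else:
        w = root.right
        while w.left is not None:        # leftmost node of the right subtree
            w = w.left                   # = node that follows v in inorder
        root.key = w.key                 # copy key(w) into node v
        root.right = remove(root.right, w.key)   # then remove w itself
    return root


def inorder(node):
    """Return the keys of the subtree in sorted (inorder) order."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


# A tree resembling the example: 3 at the root, 5 as its inorder successor.
example = Node(3, Node(1, None, Node(2)), Node(8, Node(6, Node(5)), Node(9)))
```

Removing 3 from `example` leaves 5 at the root; in a splay tree, the node to splay would then be the parent of the removed node w.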

Note on Deletion

The text (Goodrich, p. 463) uses a different convention for BST deletion in their splaying example. Instead of deleting the leftmost internal node of the right subtree, they delete the rightmost internal node of the left subtree.

We will stick with the convention of deleting the leftmost internal node of the right subtree (the node immediately following the element to be removed in an inorder traversal).

Performance

Worst-case is O(n). Example:
• Find all elements in sorted order
• This will make the tree a left linear chain of height n, with the smallest element at the bottom
• Subsequent search for the smallest element will be O(n)

Performance

Average-case (amortized) is O(log n).
• Proof uses amortized analysis
• We will not cover this

Operations on more frequently-accessed entries are faster. Given a sequence of m operations, the running time to access entry i is:

where f(i) is the number of times entry i is accessed.