comp211slides_6


Upload: alexisthe

Post on 13-Dec-2015


DESCRIPTION

File structures

TRANSCRIPT

Trees

• Definition (recursive): a tree is a finite set of elements (nodes) that is either empty or partitioned into m + 1 disjoint subsets T0, T1, …, Tm, where T0 contains one node (the root) and T1, T2, …, Tm are trees

E.G.M. Petrakis Trees 1

[Figure: tree T with root T0 and sub-trees T1, T2, …, Tm]

Definitions

• Depth of a tree node: length of the path from the root to the node
  - the root has depth 0
• Height of a tree: depth of the deepest node (d)
• Leaf: a node with no children; in a full tree all leaves are at level d
• Full tree: all levels are full
• Complete tree: all levels except perhaps level d are full

[Figure: example tree with root A and nodes B to K; a sub-tree and the leaf K are marked]

• root (tree) = A
• descendants (B) = {F, G} ≡ sons (B)
• parent (H) = E; the ancestors of H are E and A
• degree (tree) = 4: maximum number of sons
• degree (E) = 3
• level (H) = 2, level (A) = 0
• depth (tree) = 3 ≡ maximum level

Binary Trees

[Figure: binary tree with root T0 and sub-trees Tleft, Tright]

• Finite set of nodes that is either empty or partitioned into the sets T0, Tleft, Tright, where T0 is the root and Tleft, Tright are binary trees
• A binary tree with n nodes at level l has at most 2n nodes at level l+1

Full Binary Tree

• d: depth
• A full binary tree has exactly $2^l$ nodes at each level $l \le d$
• All leaf nodes are at the same level
• Total number of nodes:

$$N = \sum_{j=0}^{d} 2^j = 2^0 + 2^1 + \dots + 2^d = 2^{d+1} - 1$$

[Figure: full binary tree with root A, children B and C, and leaves D, E, F, G]

Binary Tree Traversal

• Traversal: list all nodes in order
a) preorder (depth first order):
  - root
  - traverse Tleft preorder
  - traverse Tright preorder
b) inorder (symmetric order):
  - traverse Tleft inorder
  - root
  - traverse Tright inorder
c) postorder:
  - traverse Tleft postorder
  - traverse Tright postorder
  - root

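[Figure: binary tree with root A; B and C are the children of A; D is the left child of B and G the right child of D; E and F are the children of C; H and I are the children of E]

preorder: ABDGCEHIF
inorder: DGBAHEICF
postorder: GDBHIEFCA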
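The three traversal orders can be checked against the example above with a short Java sketch. The classes `TNode` and `TraversalDemo` are hypothetical names introduced only for this illustration (they are not the BinNode ADT of the slides), and the tree shape is the one implied by the listed preorder and inorder sequences.

```java
// Hypothetical minimal node type for this sketch (not the BinNode ADT).
class TNode {
    final char label;
    final TNode left, right;

    TNode(char label, TNode left, TNode right) {
        this.label = label;
        this.left = left;
        this.right = right;
    }
}

class TraversalDemo {
    // root, then left sub-tree, then right sub-tree
    static String preorder(TNode t) {
        if (t == null) return "";
        return t.label + preorder(t.left) + preorder(t.right);
    }

    // left sub-tree, then root, then right sub-tree
    static String inorder(TNode t) {
        if (t == null) return "";
        return inorder(t.left) + t.label + inorder(t.right);
    }

    // left sub-tree, then right sub-tree, then root
    static String postorder(TNode t) {
        if (t == null) return "";
        return postorder(t.left) + postorder(t.right) + t.label;
    }

    // Tree shape reconstructed from the listed preorder and inorder sequences.
    static TNode example() {
        TNode d = new TNode('D', null, new TNode('G', null, null));
        TNode b = new TNode('B', d, null);
        TNode e = new TNode('E', new TNode('H', null, null), new TNode('I', null, null));
        TNode c = new TNode('C', e, new TNode('F', null, null));
        return new TNode('A', b, c);
    }

    public static void main(String[] args) {
        TNode t = example();
        System.out.println("preorder:  " + preorder(t));
        System.out.println("inorder:   " + inorder(t));
        System.out.println("postorder: " + postorder(t));
    }
}
```

Building strings by concatenation keeps the sketch short; for large trees a StringBuilder passed down the recursion would avoid the repeated copying.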

[Figure: binary tree with nodes A to L]

preorder: ABCEIFJDGHKL
inorder: EICFJBGDKHLA
postorder: IEJFCGKLHDBA

Traversal of Trees of Degree >= 2

[Figure: tree with root T0 and sub-trees T1, T2, …, TM]

preorder
• T0 (root)
• T1 preorder
• T2 preorder
• …
• TM preorder

inorder
• T1 inorder
• T0 (root)
• T2 inorder
• …
• TM inorder

postorder
• T1 postorder
• T2 postorder
• …
• TM postorder
• T0 (root)

[Figure: tree with nodes A to H]

preorder: ABCDHEFG
inorder: CBHDEAGF
postorder: CHDEBGFA
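The traversal rules for trees of higher degree can be sketched in Java as follows. `GNode` and `GeneralTraversalDemo` are hypothetical names, and the tree built in `example()` is an assumption: the figure did not survive, so the sketch uses one tree that is consistent with the three sequences listed above (A has children B and F; B has children C, D, E; D has child H; F has child G).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical node type: a label plus an ordered list of sub-trees T1..TM.
class GNode {
    final char label;
    final List<GNode> children;

    GNode(char label, GNode... children) {
        this.label = label;
        this.children = Arrays.asList(children);
    }
}

class GeneralTraversalDemo {
    // root first, then every sub-tree in preorder
    static String preorder(GNode t) {
        StringBuilder sb = new StringBuilder().append(t.label);
        for (GNode c : t.children) sb.append(preorder(c));
        return sb.toString();
    }

    // first sub-tree, then the root, then the remaining sub-trees
    static String inorder(GNode t) {
        StringBuilder sb = new StringBuilder();
        if (!t.children.isEmpty()) sb.append(inorder(t.children.get(0)));
        sb.append(t.label);
        for (int i = 1; i < t.children.size(); i++) sb.append(inorder(t.children.get(i)));
        return sb.toString();
    }

    // every sub-tree in postorder, then the root
    static String postorder(GNode t) {
        StringBuilder sb = new StringBuilder();
        for (GNode c : t.children) sb.append(postorder(c));
        return sb.append(t.label).toString();
    }

    // Assumed tree shape, consistent with the three listed sequences.
    static GNode example() {
        GNode b = new GNode('B', new GNode('C'), new GNode('D', new GNode('H')), new GNode('E'));
        GNode f = new GNode('F', new GNode('G'));
        return new GNode('A', b, f);
    }
}
```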

Binary Tree Node ADT

interface BinNode {
  // Return and set the element value
  public Object element();
  public Object setElement(Object v);

  // Return and set the left child
  public BinNode left();
  public BinNode setLeft(BinNode p);

  // Return and set the right child
  public BinNode right();
  public BinNode setRight(BinNode p);

  // Return true if this is a leaf node
  public boolean isLeaf();
} // interface BinNode

Binary Tree Node Class

class BinNodePtr implements BinNode {
  private Object element;  // Object for this node
  private BinNode left;    // Pointer to left child
  private BinNode right;   // Pointer to right child

  public BinNodePtr() { left = right = null; }  // Constructor 1
  public BinNodePtr(Object val) {               // Constructor 2
    left = right = null;
    element = val;
  }
  public BinNodePtr(Object val, BinNode l, BinNode r)  // Constructor 3
    { left = l; right = r; element = val; }

Binary Tree Node Class (cont.)

  // Return and set the element value
  public Object element() { return element; }
  public Object setElement(Object v) { return element = v; }

  // Return and set the left child
  public BinNode left() { return left; }
  public BinNode setLeft(BinNode p) { return left = p; }

  // Return and set the right child
  public BinNode right() { return right; }
  public BinNode setRight(BinNode p) { return right = p; }

  // Return true if this is a leaf node
  public boolean isLeaf()
    { return (left == null) && (right == null); }
} // class BinNodePtr

Binary Tree Traversal

void preorder(BinNode tree) {
  if (tree == null) return;                 // empty sub-tree: nothing to visit
  System.out.print(tree.element() + " ");   // visit the root first
  preorder(tree.left());                    // then the left sub-tree
  preorder(tree.right());                   // then the right sub-tree
}

Binary Search Tree (BST)

[Figure: example BST containing the keys 1, 3, 10, 15, 20, 25, 30]

• Each node stores a key K
• The nodes of Tleft have keys < K
• The nodes of Tright have keys >= K

Search in BST

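[Figure: BST with root 50, children 30 and 60, and leaves 20, 35, 65]

To search for a key x:
1. If the tree is empty, x is not found; otherwise compare x with the key of the root
2. If x == key(root) => key found !!
3. If x < key(root) search Tleft recursively
4. If x > key(root) search Tright recursively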

Find Min/Max Key

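[Figure: BST with keys 50, 30, 60, 20, 35, 65; 20 is marked min and 65 is marked max]

• Find the maximum key in a BST: follow right children (Tright) until the right child is NULL; the last node reached holds the maximum
• Find the minimum key in a BST: follow left children (Tleft) in the same way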

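Insert Key in BST

[Figure: BST with keys 50, 30, 60, 20, 35, 65, 40; the new key 37 is inserted as the left child of 40]

• Search for x until an empty sub-tree is reached; let p be the last node visited
  - insert x as the left child of p if x < key(p)
  - insert x as the right child of p if x >= key(p)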
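Search, min/max and insertion can be sketched together on plain integer keys. This is a minimal illustration under assumptions: `IntBST` and its method names are hypothetical, keys are `int` values rather than the Elem objects used later in the slides, and equal keys go to the right sub-tree as in the BST definition above.

```java
// Hypothetical integer-key BST used only for illustration.
class IntBST {
    private static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    void insert(int x) { root = insert(root, x); }

    // Walk down until an empty sub-tree is reached and hang the new node there.
    private Node insert(Node rt, int x) {
        if (rt == null) return new Node(x);
        if (x < rt.key) rt.left = insert(rt.left, x);
        else rt.right = insert(rt.right, x);   // keys >= rt.key go right
        return rt;
    }

    // Compare with the current node and continue left or right.
    boolean contains(int x) {
        Node rt = root;
        while (rt != null) {
            if (x == rt.key) return true;
            rt = (x < rt.key) ? rt.left : rt.right;
        }
        return false;
    }

    // Assumes a non-empty tree: follow left children to the minimum.
    int min() {
        Node rt = root;
        while (rt.left != null) rt = rt.left;
        return rt.key;
    }

    // Assumes a non-empty tree: follow right children to the maximum.
    int max() {
        Node rt = root;
        while (rt.right != null) rt = rt.right;
        return rt.key;
    }
}
```

Inserting 50, 30, 60, 20, 35, 40, 37, 65 in that order should give `min() == 20`, `max() == 65`, and `contains(37)` true.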

Shape of BST

• Depends on insertion order
• For a random insertion sequence the BST is more or less balanced
  - e.g., 10 20 5 7 1
• For an ordered insertion sequence the BST degenerates into a list
  - e.g., 1 5 7 10 20

[Figure: the BST produced by 10 20 5 7 1 (root 10, children 5 and 20, leaves 1 and 7) and the list-shaped BST produced by 1 5 7 10 20]
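The effect of insertion order can be made concrete by measuring the height of the resulting tree. The sketch below is an illustration only: `BstShapeDemo` and `heightAfterInserting` are hypothetical names, and an empty tree is given height -1 so that a single node has height 0, matching the convention that the root has depth 0.

```java
class BstShapeDemo {
    private static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private static Node insert(Node rt, int x) {
        if (rt == null) return new Node(x);
        if (x < rt.key) rt.left = insert(rt.left, x);
        else rt.right = insert(rt.right, x);
        return rt;
    }

    // Height = depth of the deepest node; -1 for an empty tree (assumption).
    private static int height(Node rt) {
        if (rt == null) return -1;
        return 1 + Math.max(height(rt.left), height(rt.right));
    }

    static int heightAfterInserting(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return height(root);
    }

    public static void main(String[] args) {
        System.out.println(heightAfterInserting(10, 20, 5, 7, 1));  // balanced: height 2
        System.out.println(heightAfterInserting(1, 5, 7, 10, 20));  // list: height 4
    }
}
```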

Delete key x from BST

• Three cases:
a) x is a leaf: simply delete it
b) x is not a leaf and has exactly one sub-tree
  - the father of x is made to point to the only child of x
  - delete node(x)
c) x is not a leaf and has two sub-trees
  - find p: the minimum of Tright or the maximum of Tleft
  - p has at most one sub-tree!!
  - delete p as in (a) or (b)
  - substitute x with p
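The three deletion cases can be sketched as one recursive routine. This is a hedged illustration, not the `removehelp` method of the slides (which is not shown in the transcript): `BstDeleteDemo`, `build` and `remove` are hypothetical names, keys are plain integers, and case (c) uses the minimum of the right sub-tree as p.

```java
class BstDeleteDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node rt, int x) {
        if (rt == null) return new Node(x);
        if (x < rt.key) rt.left = insert(rt.left, x);
        else rt.right = insert(rt.right, x);
        return rt;
    }

    static Node build(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return root;
    }

    static Node min(Node rt) {
        while (rt.left != null) rt = rt.left;
        return rt;
    }

    // Returns the root of the sub-tree after removing one node with key x (if present).
    static Node remove(Node rt, int x) {
        if (rt == null) return null;                        // key not found
        if (x < rt.key) rt.left = remove(rt.left, x);
        else if (x > rt.key) rt.right = remove(rt.right, x);
        else if (rt.left == null) return rt.right;          // cases (a) and (b)
        else if (rt.right == null) return rt.left;          // case (b)
        else {                                              // case (c)
            Node p = min(rt.right);                         // p has no left sub-tree
            rt.key = p.key;                                 // substitute x with p
            rt.right = remove(rt.right, p.key);             // delete p as in (a) or (b)
        }
        return rt;
    }

    // Keys in ascending order, separated by single spaces.
    static String inorder(Node rt) {
        if (rt == null) return "";
        return (inorder(rt.left) + " " + rt.key + " " + inorder(rt.right)).trim();
    }
}
```

Returning the new sub-tree root from `remove` lets the caller re-link the father node without storing father pointers.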

a) The key is a Leaf

b) The key has one Sub-Tree

c) The key has Two Sub-Trees

BST Sorting

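• Insert keys in BST
  - Insertion sequence: 50 30 60 20 35 40 37 65
• Output keys inorder: 20 30 35 37 40 50 60 65
• Average case: O(n log n)
• Worst case: O(n^2), when the BST degenerates into a list

[Figure: the resulting BST with root 50, children 30 and 60, then 20, 35 and 65, then 40, then 37]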
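BST sorting can be sketched in a few lines: insert every key, then copy the keys out with an inorder traversal. `BstSortDemo` and its helper names are hypothetical and the keys are plain integers; this is an illustration of the idea, not the BST class of the slides.

```java
class BstSortDemo {
    private static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private static Node insert(Node rt, int x) {
        if (rt == null) return new Node(x);
        if (x < rt.key) rt.left = insert(rt.left, x);
        else rt.right = insert(rt.right, x);
        return rt;
    }

    // Inorder traversal that writes keys into out starting at pos;
    // returns the next free position.
    private static int fill(Node rt, int[] out, int pos) {
        if (rt == null) return pos;
        pos = fill(rt.left, out, pos);
        out[pos++] = rt.key;
        return fill(rt.right, out, pos);
    }

    // Returns a sorted copy of keys; the input array is left unchanged.
    static int[] sort(int[] keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        int[] out = new int[keys.length];
        fill(root, out, 0);
        return out;
    }
}
```

On already sorted input every insertion walks the whole list-shaped tree, which is where the O(n^2) worst case comes from.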

Implementation of BSTs

• Each node stores the key and pointers to the roots of Tleft, Tright
• Array-based implementation:
  - known maximum number of nodes
  - fast
  - good for heaps
• Dynamic memory allocation:
  - no restriction on the number of nodes
  - slower but general

Dynamic Memory Allocation

• Two Java classes
  - BinNode: node class (see "Binary Tree Node ADT" and "Binary Tree Node Class")
    node operations: left, right, setValue, isleaf, father …
  - BST: tree class
    composite operations: find, insert, remove, traverse …
• Use of "help" operations:
  - easier interface to class
  - encapsulation: implementation becomes private

Elem Interface

Definition of an Object with support for a key field:

interface Elem {              // Interface for generic element type
  public abstract int key();  // Key used for search and ordering
} // interface Elem

BST Class

class BST { // Binary Search Tree implementation
  private BinNode root;  // The root of the tree

  public BST() { root = null; }  // Initialize root to null
  public void clear() { root = null; }
  public void insert(Elem val)
    { root = inserthelp(root, val); }
  public void remove(int key)
    { root = removehelp(root, key); }
  public Elem find(int key)
    { return findhelp(root, key); }
  public boolean isEmpty() { return root == null; }

BST Class (cont.)

  public void print() {  // Print out the BST
    if (root == null)
      System.out.println("The BST is empty.");
    else {
      printhelp(root, 0);
      System.out.println();
    }
  }

BST Class (cont.)

  private Elem findhelp(BinNode rt, int key) {
    if (rt == null)
      return null;                        // empty sub-tree: key not found
    Elem it = (Elem)rt.element();
    if (it.key() > key)
      return findhelp(rt.left(), key);    // key is smaller: search left
    else if (it.key() == key)
      return it;                          // key found
    else
      return findhelp(rt.right(), key);   // key is larger: search right
  }

BST Class (cont.)

  private BinNode inserthelp(BinNode rt, Elem val) {
    if (rt == null)
      return new BinNodePtr(val);   // empty sub-tree: the new node goes here
    Elem it = (Elem)rt.element();
    if (it.key() > val.key())
      rt.setLeft(inserthelp(rt.left(), val));
    else
      rt.setRight(inserthelp(rt.right(), val));
    return rt;
  }

BST Class (cont.)

  private BinNode deletemin(BinNode rt) {
    if (rt.left() == null)
      return rt.right();   // rt holds the minimum: replace it by its right sub-tree
    else {
      rt.setLeft(deletemin(rt.left()));
      return rt;
    }
  }
  … … …
} // Binary Search Tree implementation

Array Implementation

• General for trees with degree d >= 2
• Many alternatives:
1. array with pointers to father nodes
2. array with pointers to children nodes
3. array with lists of children nodes
4. binary tree array (good for heaps)

a) Pointers to Father Nodes

[Figure: tree with root A and children B, C; D, E, F are children of B and G is a child of C]

Index  Node  Father (0 = no father)
1      A     0
2      B     1
3      C     1
4      D     2
5      E     2
6      F     2
7      G     3

• Easy to find ancestors
• Difficult to find children
• Minimum space

b) Pointers to Children Nodes

[Figure: tree with root A and children B, C; D, E, F are children of B and G is a child of C]

Index  Node  Children
1      A     2 3
2      B     4 5 6
3      C     7
4      D
5      E
6      F
7      G

• Easy to find children nodes
• Difficult to find ancestor nodes

c) Lists of Children Nodes

[Figure: tree with root A and children B, C; D, E, F are children of B and G is a child of C]

Index  Node  List of children
1      A     2 -> 3
2      B     4 -> 5 -> 6
3      C     7
4      D
5      E
6      F
7      G

• Easy to find children nodes
• Difficult to find ancestor nodes

d) Binary Tree Array

[Figure: complete binary tree with root 7, children 4 and 6, and leaves 1, 2, 3, 5]

r:      0 1 2 3 4 5 6
value:  7 4 6 1 2 3 5

• parent (r) = ⌊(r-1)/2⌋ if 0 < r < n
• leftchild (r) = 2r+1 if 2r+1 < n
• rightchild (r) = 2r+2 if 2r+2 < n
• leftsibling (r) = r-1 if r is even and 0 < r < n
• rightsibling (r) = r+1 if r is odd and r+1 < n
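The index formulas translate directly into code. In the sketch below `HeapIndex` is a hypothetical helper class, r is assumed to be a valid position (0 <= r < n), and returning -1 when the requested node does not exist is an assumption made for this illustration.

```java
// Index arithmetic for a complete binary tree stored in positions 0..n-1.
class HeapIndex {
    static int parent(int r, int n) {
        return (0 < r && r < n) ? (r - 1) / 2 : -1;   // integer division floors here
    }

    static int leftChild(int r, int n) {
        return (2 * r + 1 < n) ? 2 * r + 1 : -1;
    }

    static int rightChild(int r, int n) {
        return (2 * r + 2 < n) ? 2 * r + 2 : -1;
    }

    // Right children sit at even positions, so their left sibling is r-1.
    static int leftSibling(int r, int n) {
        return (r % 2 == 0 && 0 < r && r < n) ? r - 1 : -1;
    }

    // Left children sit at odd positions, so their right sibling is r+1.
    static int rightSibling(int r, int n) {
        return (r % 2 == 1 && r + 1 < n) ? r + 1 : -1;
    }
}
```

For the 7-element array above, position 4 (value 2) has parent 1 (value 4) and left sibling 3 (value 1), while position 3 has no children.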

Heap

• Complete binary tree (not a BST)
• Max heap: every node has a value >= the values of its children
  - the max element is at the root
• Min heap: every node has a value <= the values of its children
  - the min element is at the root

[Figure: a max heap and a min heap with the max and min elements marked]

Heap Operations

• insert new element in heap
• build heap from n elements
• delete max (min) element from heap
• shift-down: move an element from the root down towards the leaves until it reaches its proper position
• Array implementation: the binary tree array (see "d) Binary Tree Array")
• In the following assume max heap

Insertion

[Figure: the new element 65 is added as a leaf to a max heap holding 57, 25, 48 and is swapped with its parent twice, ending at the root]

1. Insert the new element last in the array (as a leaf)
2. Shift it up to its proper position: swap it with its parent as long as it has a greater value than the parent
• Complexity O(log n): at most d shift operations in the array (d: depth of tree)
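The two insertion steps can be sketched on a plain `int` array. `MaxHeapInsertDemo` is a hypothetical name, and the starting layout {57, 25, 48} is an assumption about how the three values of the example are stored (the figure itself did not survive).

```java
import java.util.Arrays;

class MaxHeapInsertDemo {
    // Inserts val into the max heap stored in heap[0..n-1] and returns the new size.
    static int insert(int[] heap, int n, int val) {
        if (n >= heap.length) throw new IllegalStateException("Heap is full");
        int curr = n;
        heap[curr] = val;                                   // step 1: place as last leaf
        while (curr != 0 && heap[curr] > heap[(curr - 1) / 2]) {
            int parent = (curr - 1) / 2;                    // step 2: swap with parent
            int tmp = heap[curr];
            heap[curr] = heap[parent];
            heap[parent] = tmp;
            curr = parent;
        }
        return n + 1;
    }

    public static void main(String[] args) {
        int[] heap = {57, 25, 48, 0};       // three elements in use, one free slot
        int n = insert(heap, 3, 65);
        System.out.println(n + " " + Arrays.toString(heap));  // 4 [65, 57, 48, 25]
    }
}
```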

Build Heap

• Insert n elements in an empty heap
  - complexity: O(n log n)
• Better bottom-up method:
  - Assume H1, H2 are heaps and shift R down to its proper position (at each step R is compared with the greater of the two roots below it)
  - The same (recursively) within H1, H2

[Figure: node R above the two heaps H1 and H2]

Example

[Figure: shift-down example. R = 1 is the root above the heaps H1 (root 5) and H2 (root 7); 1 is swapped first with 7 and then with 6, giving a heap with root 7 and children 5 and 6]

• Starts at level d-1: the leaves need not be accessed
• Shift down elements
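The bottom-up method can be sketched as a loop over the internal nodes, from the last one back to the root. `BuildHeapDemo` is a hypothetical name. The input array {1, 5, 7, 4, 2, 6, 3} follows the example, but the order of the four leaves is an assumption because the figure is garbled in the transcript.

```java
class BuildHeapDemo {
    // Push h[pos] down until neither child is larger (max heap, n elements in use).
    static void siftDown(int[] h, int n, int pos) {
        while (2 * pos + 1 < n) {                      // pos has at least a left child
            int j = 2 * pos + 1;
            if (j + 1 < n && h[j + 1] > h[j]) j++;     // j: child with the greater value
            if (h[pos] >= h[j]) return;                // proper position reached
            int tmp = h[pos];
            h[pos] = h[j];
            h[j] = tmp;
            pos = j;                                   // move down
        }
    }

    // Start at the last internal node (level d-1); the leaves are skipped.
    static void buildHeap(int[] h) {
        for (int i = h.length / 2 - 1; i >= 0; i--) siftDown(h, h.length, i);
    }
}
```

With the assumed input, the two sub-trees are already heaps, so only the root moves: 1 is swapped with 7 and then with 6, giving {7, 5, 6, 4, 2, 1, 3}.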

Complexity

• Up to n/2^1 elements at level d => 0 moves/elem
• Up to n/2^2 elements at level d-1 => 1 move/elem
• …
• Up to n/2^i elements at level d-i+1 => i-1 moves/elem

Total number of moves:

$$\sum_{i=1}^{d = \log n} (i-1)\,\frac{n}{2^i} \in \Theta(n)$$
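One way to see why the total number of moves stays linear is to bound the finite sum by the corresponding infinite series. The derivation below is a sketch that uses the standard identities $\sum_{i \ge 1} i/2^i = 2$ and $\sum_{i \ge 1} 1/2^i = 1$:

```latex
\begin{align*}
\sum_{i=1}^{d} (i-1)\,\frac{n}{2^{i}}
  &\le n \sum_{i=1}^{\infty} \frac{i-1}{2^{i}}
   = n \left( \sum_{i=1}^{\infty} \frac{i}{2^{i}} - \sum_{i=1}^{\infty} \frac{1}{2^{i}} \right) \\
  &= n\,(2 - 1) = n
\end{align*}
```

So the bottom-up construction performs at most about n moves, compared with O(n log n) for n successive insertions.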

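Deletion

[Figure: deleting the max element 65 from the max heap 65, 57, 48, 25: 65 is swapped with the last element 25 and removed, then 25 is shifted down and 57 becomes the root with children 25 and 48]

• The max element is removed
  - swap with last element in array
  - delete last element
  - shift-down root to its proper position
• Complexity: O(log n)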

Class Heap

public class Heap {      // Heap class
  private Elem[] Heap;   // Pointer to the heap array
  private int size;      // Maximum size of the heap
  private int n;         // Number of elements now in the heap

  public Heap(Elem[] h, int num, int max)  // Constructor
    { Heap = h; n = num; size = max; buildheap(); }

  public int heapsize()  // Return current size of the heap
    { return n; }

  public boolean isLeaf(int pos)  // TRUE if pos is a leaf position
    { return (pos >= n/2) && (pos < n); }

Class Heap (cont.)

  // Return position for left child of pos
  public int leftchild(int pos) {
    Assert.notFalse(pos < n/2, "Position has no left child");
    return 2*pos + 1;
  }

  // Return position for right child of pos
  public int rightchild(int pos) {
    Assert.notFalse(pos < (n-1)/2, "Position has no right child");
    return 2*pos + 2;
  }

Class Heap (cont.)

  public int parent(int pos) {  // Return position for parent
    Assert.notFalse(pos > 0, "Position has no parent");
    return (pos-1)/2;
  }

  public void buildheap()  // Heapify contents of Heap
    { for (int i=n/2-1; i>=0; i--) siftdown(i); }

Class Heap (cont.)

  // Put element in its correct place.
  // Min-heap order, matching insert and removemin: smaller keys stay closer to the root.
  private void siftdown(int pos) {
    Assert.notFalse((pos >= 0) && (pos < n), "Illegal heap position");
    while (!isLeaf(pos)) {
      int j = leftchild(pos);
      if ((j<(n-1)) && (Heap[j].key() > Heap[j+1].key()))
        j++;  // j is now index of child with lesser value
      if (Heap[pos].key() <= Heap[j].key()) return;  // Done
      DSutil.swap(Heap, pos, j);
      pos = j;  // Move down
    }
  }

Class Heap (cont.)

  public void insert(Elem val) {  // Insert value into heap
    Assert.notFalse(n < size, "Heap is full");
    int curr = n++;
    Heap[curr] = val;  // Start at end of heap
    // Now sift up until curr's parent's key <= curr's key
    while ((curr!=0) && (Heap[curr].key()<Heap[parent(curr)].key())) {
      DSutil.swap(Heap, curr, parent(curr));
      curr = parent(curr);
    }
  }

Class Heap (cont.)

  public Elem removemin() {  // Remove minimum value
    Assert.notFalse(n > 0, "Removing from empty heap");
    DSutil.swap(Heap, 0, --n);  // Swap minimum with last value
    if (n != 0)                 // Not on last element
      siftdown(0);              // Put new heap root val in correct place
    return Heap[n];
  }

Class Heap (cont.)

  // Remove value at specified position
  public Elem remove(int pos) {
    Assert.notFalse((pos > 0) && (pos < n), "Illegal heap position");
    DSutil.swap(Heap, pos, --n);  // Swap with last value
    if (pos < n) {                // pos was not the last element
      // The moved value may be smaller than its parent: sift it up first
      while ((pos != 0) && (Heap[pos].key() < Heap[parent(pos)].key())) {
        DSutil.swap(Heap, pos, parent(pos));
        pos = parent(pos);
      }
      siftdown(pos);              // Then push it down to its correct place
    }
    return Heap[n];
  }
} // Heap class

Heap Sort

• Insert all elements in a new heap array
• At step i delete the max element
  - delete: put the max element last in the array
• Store this element at the array position just freed at the end of the heap
• After n steps the array is sorted!!
• Complexity: O(n log n), since each of the n deletions costs O(log n)

[Figure: heap sort of x[0..7]. Each step deletes the max element of the heap and stores it in the position freed at the end of the heap. The bar separates the heap (left) from the sorted part (right).]

Step                    x[0] x[1] x[2] x[3] x[4] x[5] x[6] x[7]
(a) Initial heap         92   37   86   33   12   48   57   25
(b) x[7] = delete(92)    86   37   57   33   12   48   25 | 92
(c) x[6] = delete(86)    57   37   48   33   12   25 | 86   92
(d) x[5] = delete(57)    48   37   25   33   12 | 57   86   92
(e) x[4] = delete(48)    37   33   25   12 | 48   57   86   92
(f) x[3] = delete(37)    33   12   25 | 37   48   57   86   92
(g) x[2] = delete(33)    25   12 | 33   37   48   57   86   92
(h) x[1] = delete(25)    12 | 25   33   37   48   57   86   92
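The steps of the example can be reproduced with an in-place heap sort on an `int` array. This is a minimal sketch under assumptions: `HeapSortDemo` is a hypothetical name, and the heap is built bottom-up inside the same array rather than by inserting into a new one.

```java
class HeapSortDemo {
    // Push x[pos] down within the heap x[0..n-1] (max heap).
    static void siftDown(int[] x, int n, int pos) {
        while (2 * pos + 1 < n) {
            int j = 2 * pos + 1;
            if (j + 1 < n && x[j + 1] > x[j]) j++;   // child with the greater value
            if (x[pos] >= x[j]) return;
            int tmp = x[pos];
            x[pos] = x[j];
            x[j] = tmp;
            pos = j;
        }
    }

    static void heapSort(int[] x) {
        int n = x.length;
        for (int i = n / 2 - 1; i >= 0; i--) siftDown(x, n, i);   // build the max heap
        while (n > 1) {
            n--;                       // the heap shrinks by one element
            int max = x[0];            // delete max: swap it with the last heap element
            x[0] = x[n];
            x[n] = max;                // the max is stored in the freed position
            siftDown(x, n, 0);         // restore the heap on x[0..n-1]
        }
    }
}
```

Starting from the initial heap 92 37 86 33 12 48 57 25, the first pass of the loop yields 86 37 57 33 12 48 25 | 92, matching step (b).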

Huffman Coding

• Goal: improvement in space requirements in exchange for a penalty in running time
• Assigns codes to symbols
  - code: binary number
• Main idea: the length of a symbol's code depends on the symbol's frequency
  - shorter codes to more frequent symbols

Huffman Coding (cont.)

• Input: sequence of symbols (letters)
  - typically 8 bits/symbol
• Output: a binary representation
  - the codes of all symbols are concatenated in the same order they appear in the input
  - a code cannot be a prefix of another
  - less than 8 bits/symbol on average (a weighted average over the N distinct symbols):

$$n = \frac{\sum_{i=1}^{N} n_i\, p_i}{\sum_{i=1}^{N} n_i}$$

Huffman Coding (cont.)

• The code of each symbol is derived from a binary tree
  - each leaf corresponds to a letter
  - goal: build a tree with the minimum external path weight (epw)
  - epw: sum of weighted path lengths
  - min epw <=> min n
  - a letter with high weight (frequency) should have low depth => short code
  - a letter with low weight may be pushed deeper in the Huffman tree => longer code

Building the Huffman Tree

• The tree is built bottom-up
  - Order the letters by ascending frequency
  - The first two letters become leaves of the Huffman tree
  - Substitute the two letters with one element with weight equal to the sum of their weights
  - Put this new element back on the ordered list (in its correct place)
  - Repeat until only one element (the root of the Huffman tree) remains in the list

Assigning Codes to Letters

• Beginning at the root, assign 0 to left branches and 1 to right branches
• The Huffman code of a letter is the binary number determined by the path from the root to the leaf of that letter
• Longer paths correspond to less frequent letters (and the reverse)
• Replace each letter in the input with its code

Huffman Example

Parentheses denote a merged sub-tree (left branch first).

Step  Weights               Symbols
0     .12 .40 .15 .08 .25   a  b  c  d  e
1     .20 .40 .15 .25       (d a)  b  c  e
2     .35 .40 .25           (c (d a))  b  e
3     .60 .40               (e (c (d a)))  b
4     1.0                   (b (e (c (d a)))) = Huffman tree
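The construction steps of the example can be sketched with a priority queue that always yields the two lightest elements. This is an illustration under assumptions: `HuffmanDemo` and its members are hypothetical names, the weights are scaled to integers (12, 40, 15, 8, 25) to avoid floating-point comparisons, and the lighter of the two removed elements becomes the left child. Ties between equal weights are not handled in any particular way.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

class HuffmanDemo {
    static class Node {
        final char symbol;      // meaningful only for leaves
        final int weight;
        final Node left, right;

        Node(char symbol, int weight) { this(symbol, weight, null, null); }

        Node(char symbol, int weight, Node left, Node right) {
            this.symbol = symbol;
            this.weight = weight;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() { return left == null; }
    }

    // Repeatedly merge the two lightest elements until one tree remains.
    static Node build(char[] symbols, int[] weights) {
        PriorityQueue<Node> pq = new PriorityQueue<>(Comparator.comparingInt((Node x) -> x.weight));
        for (int i = 0; i < symbols.length; i++) pq.add(new Node(symbols[i], weights[i]));
        while (pq.size() > 1) {
            Node first = pq.poll();     // lightest element: left child (assumption)
            Node second = pq.poll();    // second lightest: right child
            pq.add(new Node('\0', first.weight + second.weight, first, second));
        }
        return pq.poll();
    }

    // 0 for left branches, 1 for right branches.
    static void assign(Node t, String prefix, Map<Character, String> codes) {
        if (t.isLeaf()) { codes.put(t.symbol, prefix); return; }
        assign(t.left, prefix + "0", codes);
        assign(t.right, prefix + "1", codes);
    }

    static Map<Character, String> codes(char[] symbols, int[] weights) {
        Map<Character, String> codes = new TreeMap<>();
        assign(build(symbols, weights), "", codes);
        return codes;
    }
}
```

For the weights of the example this produces a: 1111, b: 0, c: 110, d: 1110, e: 10.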

Huffman Encoding

[Figure: the Huffman tree of the example with 0 on left branches and 1 on right branches]

Huffman codes
a: 1111
b: 0
c: 110
d: 1110
e: 10

input: acde => Huffman code: 1111 110 1110 10

Huffman Decoding

code:     1 1 1 1 ¦ 1 1 1 0 ¦ 1 0 ¦ 0 ¦ 1 1 0
letters:  a ¦ d ¦ e ¦ b ¦ c

[Figure: the Huffman tree of the example]

• Repeat until end of code
  - beginning at the root, take the right branch for each 1 and the left branch for each 0 until we reach a letter
  - output the letter
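The decoding loop can be sketched as a walk over the code tree. In this illustration `HuffmanDecodeDemo` and its members are hypothetical names, the tree is rebuilt from the code table of the example instead of being passed along with the data, and the input is assumed to be a valid concatenation of codes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HuffmanDecodeDemo {
    static class Node {
        Node left, right;
        char symbol;
        boolean leaf;
    }

    // Rebuild the code tree from a symbol -> code table (0 = left, 1 = right).
    static Node treeFrom(Map<Character, String> codes) {
        Node root = new Node();
        for (Map.Entry<Character, String> e : codes.entrySet()) {
            Node t = root;
            for (char bit : e.getValue().toCharArray()) {
                if (bit == '0') {
                    if (t.left == null) t.left = new Node();
                    t = t.left;
                } else {
                    if (t.right == null) t.right = new Node();
                    t = t.right;
                }
            }
            t.leaf = true;
            t.symbol = e.getKey();
        }
        return root;
    }

    // Start at the root, follow one branch per bit, output a letter at each leaf.
    static String decode(Node root, String bits) {
        StringBuilder out = new StringBuilder();
        Node t = root;
        for (char bit : bits.toCharArray()) {
            t = (bit == '1') ? t.right : t.left;
            if (t.leaf) {
                out.append(t.symbol);
                t = root;               // restart at the root for the next code
            }
        }
        return out.toString();
    }

    // Code table of the example.
    static Map<Character, String> exampleCodes() {
        Map<Character, String> codes = new LinkedHashMap<>();
        codes.put('a', "1111");
        codes.put('b', "0");
        codes.put('c', "110");
        codes.put('d', "1110");
        codes.put('e', "10");
        return codes;
    }
}
```

Because no code is a prefix of another, the walk never has to look ahead: reaching a leaf always ends exactly one code.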

Critique

• Advantages:
  - compression (40-80%)
  - cheaper transmission
  - some security (encryption)
  - locally optimal code
• Disadvantages:
  - CPU cost
  - difficult error correction
  - we need the Huffman tree for the decoding
  - code not globally optimal