3 Trees: traversal and analysis of standard search trees. Summer Term 2010. Robert Elsässer


Page 1: 3 Trees: traversal and analysis of standard search treeslak.informatik.uni-freiburg.de/.../Slides/03_Tree_traversal_analysis.pdf · 27.04.2010 Theory 1 -traversal and analysis of

3 Trees: traversal and analysis of standard search trees

Summer Term 2010

Robert Elsässer


Binary Search Trees

Binary trees for storing sets of keys (in the internal nodes of the trees), such that the operations
- find
- insert
- delete (remove)
are supported.

Search tree property: All keys in the left subtree of a node p are smaller than the key of p, and the key of p is smaller than all keys in the right subtree of p.

Implementation:
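A minimal sketch of one possible implementation, not the lecture's original listing. The node type `SearchNode` with the fields `content`, `left` and `right` follows the traversal code later in these slides; the class `SearchTree`, the `int` key type and the signatures of `find` and `insert` are assumptions made for illustration.

```java
// Node of a binary search tree; an empty subtree is represented by null.
class SearchNode {
    int content;              // the key stored in this node
    SearchNode left, right;

    SearchNode(int content) { this.content = content; }
}

// Hypothetical container class (name and methods are assumptions).
class SearchTree {
    SearchNode root;

    // Returns true if key is stored in the tree.
    boolean find(int key) {
        SearchNode n = root;
        while (n != null) {
            if (key == n.content) return true;
            n = (key < n.content) ? n.left : n.right;   // search tree property
        }
        return false;
    }

    // Inserts key at the position where the search for it ends.
    // Returns false if the key was already present.
    boolean insert(int key) {
        if (root == null) { root = new SearchNode(key); return true; }
        SearchNode n = root;
        while (true) {
            if (key == n.content) return false;
            if (key < n.content) {
                if (n.left == null) { n.left = new SearchNode(key); return true; }
                n = n.left;
            } else {
                if (n.right == null) { n.right = new SearchNode(key); return true; }
                n = n.right;
            }
        }
    }
}

class SearchTreeDemo {
    public static void main(String[] args) {
        SearchTree t = new SearchTree();
        for (int k : new int[] {9, 3, 12, 4, 5}) t.insert(k);
        System.out.println(t.find(5) + " " + t.find(7));
    }
}
```

Both operations follow a single path from the root, so their cost is bounded by the height of the tree.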

27.04.2010 Theory 1 - traversal and analysis of standard search trees 2


Standard binary search trees (8)

Insert 5

[Figure: a binary search tree with the keys 3, 4, 9, 12, shown before and after inserting the key 5]

Tree structure depends on the order of insertions into the initially empty tree.

Height can increase linearly, but it can also be in $O(\log n)$, more precisely $\lceil \log_2(n+1) \rceil$.
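The dependence on the insertion order can be checked with a small experiment. The sketch below is an illustration under assumptions: the names `HeightNode`, `InsertionOrderDemo`, `insert`, `height` and `heightAfter` are hypothetical, and height is counted as the number of nodes on the longest path from the root, so the empty tree has height 0.

```java
class HeightNode {
    int key;
    HeightNode left, right;
    HeightNode(int key) { this.key = key; }
}

class InsertionOrderDemo {
    // Standard insertion without rebalancing; duplicates are ignored.
    static HeightNode insert(HeightNode n, int key) {
        if (n == null) return new HeightNode(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;
    }

    // Number of nodes on the longest path from n downwards; 0 for the empty tree.
    static int height(HeightNode n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    // Inserts the keys in the given order into an empty tree and returns its height.
    static int heightAfter(int[] keys) {
        HeightNode root = null;
        for (int k : keys) root = insert(root, k);
        return height(root);
    }

    public static void main(String[] args) {
        int[] sorted   = {1, 2, 3, 4, 5, 6, 7};
        int[] balanced = {4, 2, 6, 1, 3, 5, 7};
        System.out.println(heightAfter(sorted));    // 7: the tree degenerates into a list
        System.out.println(heightAfter(balanced));  // 3 = ceil(log2(7 + 1))
    }
}
```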


Traversal of trees

Traversal of the nodes of a tree
- for output
- for calculating the sum, average, number of keys ...
- for changing the structure

Most important traversal orders:
1. Preorder = NLR (Node-Left-Right): first visit the root, then recursively the left and right subtree (if existent)
2. Postorder = LRN
3. Inorder = LNR
4. The mirror image versions of 1-3

Preorder

Preorder traversal is recursively defined as follows:

Traversal of all nodes of a binary tree with root p in preorder: Visit p, traverse the left subtree of p in preorder, traverse the right subtree of p in preorder.

[Figure: binary search tree with root 17, whose children are 11 and 22; 11 has the children 7 and 14; 14 has the left child 12]

Preorder implementation

// Preorder: Node-Left-Right
void preOrder() {
    preOrder(root);
    System.out.println();
}

void preOrder(SearchNode n) {
    if (n == null) return;
    System.out.print(n.content + " ");
    preOrder(n.left);
    preOrder(n.right);
}

// Postorder: Left-Right-Node
void postOrder() {
    postOrder(root);
    System.out.println();
}

// ...

Inorder

The traversal order is: first the left subtree, then the root, then the right subtree:

// Inorder: Left-Node-Right
void inOrder() {
    inOrder(root);
    System.out.println();
}

void inOrder(SearchNode n) {
    if (n == null) return;
    inOrder(n.left);
    System.out.print(n.content + " ");
    inOrder(n.right);
}

Example

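[Figure: binary search tree with root 17, whose children are 11 and 22; 11 has the children 7 and 14; 14 has the left child 12]

Preorder: 17, 11, 7, 14, 12, 22

Postorder: 7, 12, 14, 11, 22, 17

Inorder: 7, 11, 12, 14, 17, 22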
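The three sequences above can be reproduced with a short program. This is a sketch under assumptions: `TNode` and `TraversalDemo` are hypothetical names, the tree is built by hand in the shape implied by the sequences, and the postorder method is written from the LRN definition because its body is elided in the slides.

```java
import java.util.ArrayList;
import java.util.List;

class TNode {
    int content;
    TNode left, right;
    TNode(int content, TNode left, TNode right) {
        this.content = content;
        this.left = left;
        this.right = right;
    }
}

class TraversalDemo {
    // Node-Left-Right
    static void preOrder(TNode n, List<Integer> out) {
        if (n == null) return;
        out.add(n.content);
        preOrder(n.left, out);
        preOrder(n.right, out);
    }

    // Left-Right-Node
    static void postOrder(TNode n, List<Integer> out) {
        if (n == null) return;
        postOrder(n.left, out);
        postOrder(n.right, out);
        out.add(n.content);
    }

    // Left-Node-Right
    static void inOrder(TNode n, List<Integer> out) {
        if (n == null) return;
        inOrder(n.left, out);
        out.add(n.content);
        inOrder(n.right, out);
    }

    // Root 17; children 11 and 22; 11 has children 7 and 14; 14 has left child 12.
    static TNode exampleTree() {
        TNode n14 = new TNode(14, new TNode(12, null, null), null);
        TNode n11 = new TNode(11, new TNode(7, null, null), n14);
        return new TNode(17, n11, new TNode(22, null, null));
    }

    public static void main(String[] args) {
        List<Integer> pre = new ArrayList<>(), post = new ArrayList<>(), in = new ArrayList<>();
        preOrder(exampleTree(), pre);
        postOrder(exampleTree(), post);
        inOrder(exampleTree(), in);
        System.out.println("Preorder:  " + pre);   // [17, 11, 7, 14, 12, 22]
        System.out.println("Postorder: " + post);  // [7, 12, 14, 11, 22, 17]
        System.out.println("Inorder:   " + in);    // [7, 11, 12, 14, 17, 22]
    }
}
```

For a search tree, the inorder sequence is always the sorted sequence of the keys.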


Non-recursive variants with threaded trees

Recursion can be avoided if, instead of null pointers, so-called thread pointers to the successors or predecessors are used.
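One way to realise this idea is a right-threaded tree, sketched below. It is an illustration, not the lecture's implementation: only right null pointers are replaced by threads to the inorder successor, and the names `ThreadedNode`, `ThreadedTree`, `rightIsThread` and `leftmost` are assumptions.

```java
class ThreadedNode {
    int content;
    ThreadedNode left;
    ThreadedNode right;       // right child, or thread to the inorder successor
    boolean rightIsThread;    // true if 'right' is a thread rather than a child
    ThreadedNode(int content) { this.content = content; }
}

class ThreadedTree {
    ThreadedNode root;

    void insert(int key) {
        ThreadedNode n = new ThreadedNode(key);
        if (root == null) { root = n; return; }
        ThreadedNode p = root;
        while (true) {
            if (key == p.content) return;            // ignore duplicates
            if (key < p.content) {
                if (p.left == null) {
                    n.right = p;                     // p is the inorder successor of n
                    n.rightIsThread = true;
                    p.left = n;
                    return;
                }
                p = p.left;
            } else {
                if (p.right == null || p.rightIsThread) {
                    n.right = p.right;               // n inherits p's successor (may be null)
                    n.rightIsThread = (p.right != null);
                    p.right = n;
                    p.rightIsThread = false;
                    return;
                }
                p = p.right;
            }
        }
    }

    private static ThreadedNode leftmost(ThreadedNode n) {
        while (n != null && n.left != null) n = n.left;
        return n;
    }

    // Inorder traversal with neither recursion nor an explicit stack.
    String inOrder() {
        StringBuilder sb = new StringBuilder();
        ThreadedNode n = leftmost(root);
        while (n != null) {
            sb.append(n.content).append(' ');
            n = n.rightIsThread ? n.right : leftmost(n.right);
        }
        return sb.toString().trim();
    }
}

class ThreadedDemo {
    public static void main(String[] args) {
        ThreadedTree t = new ThreadedTree();
        for (int k : new int[] {17, 11, 22, 7, 14, 12}) t.insert(k);
        System.out.println(t.inOrder());   // 7 11 12 14 17 22
    }
}
```

The extra flag per node is the price for telling a thread apart from a real child pointer.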


Example for Search, Insertion, Deletion

[Figure: binary search tree with root 17, whose children are 11 and 22; 11 has the children 7 and 14; 14 has the left child 12]

Sorting with standard search trees

Idea: Create a search tree for the input sequence and output the keys by an inorder traversal.

Remark: Depending on the input sequence, the search tree may degenerate.

Complexity: Depends on the internal path length.

Worst case: Sorted input ⇒ $\Omega(n^2)$ steps.

Best case: We get a complete search tree of minimal height of about log n. Then n insertions and outputs are possible in time $O(n \log n)$.

Average case: ?
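The sorting idea can be sketched in a few lines. The names `SortNode` and `TreeSortSketch` are hypothetical. One design decision is an assumption that goes beyond the slides: equal keys are sent to the right subtree so that duplicates in the input are not lost, which relaxes the strict search tree property.

```java
import java.util.ArrayList;
import java.util.List;

class SortNode {
    int key;
    SortNode left, right;
    SortNode(int key) { this.key = key; }
}

class TreeSortSketch {
    // Standard insertion without rebalancing; equal keys go to the right.
    static SortNode insert(SortNode n, int key) {
        if (n == null) return new SortNode(key);
        if (key < n.key) n.left = insert(n.left, key);
        else n.right = insert(n.right, key);
        return n;
    }

    // Inorder traversal appends the keys in ascending order.
    static void inOrder(SortNode n, List<Integer> out) {
        if (n == null) return;
        inOrder(n.left, out);
        out.add(n.key);
        inOrder(n.right, out);
    }

    static List<Integer> sort(int[] input) {
        SortNode root = null;
        for (int k : input) root = insert(root, k);
        List<Integer> out = new ArrayList<>();
        inOrder(root, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sort(new int[] {17, 11, 22, 7, 14, 12}));
    }
}
```

The number of comparisons made while building the tree is governed by the internal path length of the final tree, which is why the analysis that follows concentrates on that quantity. For long sorted inputs the recursion depth of this sketch grows linearly, in line with the worst case described above.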


Analysis of search trees

Two possible approaches to determine the internal path length:

1. Random tree analysis, i.e. average over all possible permutations of keys to be inserted (into the initially empty tree).

2. Shape analysis, i.e. average over all structurally different trees with n keys.

The expected values for the internal path length differ:

1. ≈ $1.386\, n \log_2 n - 1.846\, n + O(\log n)$

2. ≈ $n \sqrt{\pi n} + O(n)$

Reason for the difference

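[Figure: the five structurally different binary search trees with the keys 1, 2, 3, each labelled with the insertion orders that generate it]

Insertion orders: 3,2,1 | 3,1,2 | 1,3,2 | 1,2,3 | 2,1,3 and 2,3,1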

Random tree analysis counts more balanced trees more often.
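For three keys this can be verified by brute force: of the six insertion orders, two produce the balanced shape and each of the four degenerate shapes is produced by exactly one. The sketch below counts this; the names `ShapeNode`, `ShapeCount`, `shape` and `countShapes` are hypothetical, and a shape is encoded as a string with "." for an empty subtree.

```java
import java.util.Map;
import java.util.TreeMap;

class ShapeNode {
    int key;
    ShapeNode left, right;
    ShapeNode(int key) { this.key = key; }
}

class ShapeCount {
    static ShapeNode insert(ShapeNode n, int key) {
        if (n == null) return new ShapeNode(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;
    }

    // Encodes only the structure, not the keys:
    // "." for an empty subtree, "(" left right ")" for a node.
    static String shape(ShapeNode n) {
        return n == null ? "." : "(" + shape(n.left) + shape(n.right) + ")";
    }

    // Builds a tree for each permutation of 1, 2, 3 and counts how often each shape occurs.
    static Map<String, Integer> countShapes() {
        int[][] perms = {{1, 2, 3}, {1, 3, 2}, {2, 1, 3}, {2, 3, 1}, {3, 1, 2}, {3, 2, 1}};
        Map<String, Integer> counts = new TreeMap<>();
        for (int[] p : perms) {
            ShapeNode root = null;
            for (int k : p) root = insert(root, k);
            counts.merge(shape(root), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // The balanced shape "((..)(..))" is the only one with count 2.
        countShapes().forEach((s, c) -> System.out.println(s + " : " + c));
    }
}
```

Shape analysis gives each of the five shapes weight 1/5, whereas random tree analysis gives the balanced shape weight 2/6 and each degenerate shape weight 1/6.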


Internal path length

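Internal path length I: a measure for judging the quality of a search tree t.

Recursive definition:

1. If t is empty, then $I(t) = 0$.

2. For a tree t with left subtree $t_l$ and right subtree $t_r$: $I(t) = I(t_l) + I(t_r) + \text{number of nodes of } t$.

Apparently: $I(t) = \sum_{p \text{ internal node of } t} (\mathrm{depth}(p) + 1)$.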
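A small sketch that computes the internal path length in two ways. It assumes the convention that every node contributes the number of nodes on its search path (its depth plus 1, with the root at depth 0), so that the empty tree has I = 0 and a tree contributes I(left) + I(right) + its number of nodes. The names `PNode` and `PathLength` are hypothetical.

```java
class PNode {
    int key;
    PNode left, right;
    PNode(int key, PNode left, PNode right) {
        this.key = key;
        this.left = left;
        this.right = right;
    }
}

class PathLength {
    static int size(PNode t) {
        return t == null ? 0 : 1 + size(t.left) + size(t.right);
    }

    // Recursive definition: I(empty) = 0, I(t) = I(t_l) + I(t_r) + number of nodes of t.
    // (Recomputing size() keeps the sketch short but is not efficient.)
    static int internalPathLength(PNode t) {
        if (t == null) return 0;
        return internalPathLength(t.left) + internalPathLength(t.right) + size(t);
    }

    // Equivalent sum over all nodes p of (depth(p) + 1).
    static int byDepth(PNode t, int depth) {
        if (t == null) return 0;
        return (depth + 1) + byDepth(t.left, depth + 1) + byDepth(t.right, depth + 1);
    }

    // Root 17; children 11 and 22; 11 has children 7 and 14; 14 has left child 12.
    static PNode exampleTree() {
        PNode n14 = new PNode(14, new PNode(12, null, null), null);
        PNode n11 = new PNode(11, new PNode(7, null, null), n14);
        return new PNode(17, n11, new PNode(22, null, null));
    }

    public static void main(String[] args) {
        PNode t = exampleTree();
        int i = internalPathLength(t);
        System.out.println(i + " " + byDepth(t, 0));   // 15 15
        System.out.println((double) i / size(t));      // 2.5 nodes per search on average
    }
}
```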


Average search path length

For a tree t with n internal nodes, the average search path length is defined by:

$D(t) = I(t) / n$

Question: What is the size of D(t) in the
- best
- worst
- average
case for a tree t with n internal nodes?

Internal path: best case

We obtain a complete binary tree.
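As a worked calculation for this case, here is a sketch under the assumption that a node on level i (the root being level 1) contributes i to the internal path length, and that the tree is complete with height h, i.e. n = 2^h - 1 nodes:

```latex
% level i holds 2^{i-1} nodes, each of which contributes i
\[
  I(t) \;=\; \sum_{i=1}^{h} i \cdot 2^{\,i-1}
       \;=\; (h-1)\,2^{h} + 1
       \;=\; (n+1)\log_2(n+1) - n
\]
\[
  D(t) \;=\; \frac{I(t)}{n}
       \;=\; \frac{n+1}{n}\,\log_2(n+1) - 1
       \;\approx\; \log_2 n - 1
\]
```

For h = 2 (three nodes) this gives I(t) = 1 + 2 + 2 = 5, in agreement with (h-1)·2^h + 1 = 5.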


Internal path: worst case
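A sketch of the corresponding calculation, assuming the same convention as for the best case (a node on level i contributes i) and assuming that the worst case is a tree degenerated into a linear list with one node per level:

```latex
% one node on each of the levels 1, ..., n
\[
  I(t) \;=\; \sum_{i=1}^{n} i \;=\; \frac{n(n+1)}{2},
  \qquad
  D(t) \;=\; \frac{I(t)}{n} \;=\; \frac{n+1}{2}
\]
```

The average search path length is therefore linear in n, compared with roughly log₂ n in the best case.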


Random trees

Without loss of generality, let 1, …, n be the keys to be inserted.

Let $s_1, \dots, s_n$ be a random permutation of these keys.

Hence, the probability that $s_1$ has the value k is $P(s_1 = k) = 1/n$.

If k is the first key, k will be stored in the root.

Then the left subtree contains k-1 elements (the keys 1, …, k-1) and the right subtree contains n-k elements (the keys k+1, …, n).

Expected internal path length

EI(n): Expectation for the internal path length of a randomly generated binary search tree with n nodes.

Apparently we have:

$EI(0) = 0, \quad EI(1) = 1,$

$EI(n) = \frac{1}{n} \sum_{k=1}^{n} \bigl( EI(k-1) + EI(n-k) + n \bigr) = n + \frac{1}{n} \Bigl( \sum_{k=1}^{n} EI(k-1) + \sum_{k=1}^{n} EI(n-k) \Bigr)$

Claim: $EI(n) \approx 1.386\, n \log_2 n - 1.846\, n + O(\log n)$.

Proof (1)

$EI(n+1) = (n+1) + \frac{2}{n+1} \sum_{k=0}^{n} EI(k)$

and hence

$(n+1) \cdot EI(n+1) = (n+1)^2 + 2 \sum_{k=0}^{n} EI(k)$

$n \cdot EI(n) = n^2 + 2 \sum_{k=0}^{n-1} EI(k)$

From the last two equations it follows that

$(n+1) \cdot EI(n+1) - n \cdot EI(n) = 2n + 1 + 2 \cdot EI(n)$, i.e. $EI(n+1) = \frac{n+2}{n+1}\, EI(n) + \frac{2n+1}{n+1}$

Proof (2)

By induction over n it is possible to show that for all n ≥ 1:

$EI(n) = 2(n+1) H_n - 3n$

where $H_n = 1 + \frac{1}{2} + \dots + \frac{1}{n}$ is the n-th harmonic number, which can be estimated as follows:

$H_n = \ln n + \gamma + \frac{1}{2n} + O\!\left(\frac{1}{n^2}\right)$

where $\gamma = 0.5772\ldots$ is the so-called Euler constant.

Proof (3)

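Thus,

$EI(n) = 2n \ln n - (3 - 2\gamma)\, n + 2 \ln n + 1 + 2\gamma + O\!\left(\frac{1}{n}\right)$

and hence,

$\frac{EI(n)}{n} = 2 \ln n - (3 - 2\gamma) + \frac{2 \ln n}{n} + O\!\left(\frac{1}{n}\right)$

Since $2 \ln n = 2 \ln 2 \cdot \log_2 n \approx 1.386 \log_2 n$, the leading term is $1.386 \log_2 n$.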
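The result of the proof can be checked numerically. The sketch below rests on two assumptions that are stated here because the equations themselves are not legible in this transcript: the recurrence EI(0) = 0, EI(n) = n + (2/n) · (EI(0) + ... + EI(n-1)), which follows from the root being key k with probability 1/n, and the closed form EI(n) = 2(n+1)·H_n - 3n from the standard analysis. The class and method names are hypothetical.

```java
class ExpectedPathLength {
    // Evaluates the recurrence EI(0) = 0, EI(n) = n + (2/n) * sum_{k=0}^{n-1} EI(k).
    static double[] byRecurrence(int max) {
        double[] ei = new double[max + 1];
        double prefix = 0.0;                  // running sum EI(0) + ... + EI(n-1)
        for (int n = 1; n <= max; n++) {
            prefix += ei[n - 1];
            ei[n] = n + 2.0 * prefix / n;
        }
        return ei;
    }

    // Closed form 2(n+1) * H_n - 3n with the harmonic number H_n = 1 + 1/2 + ... + 1/n.
    static double closedForm(int n) {
        double h = 0.0;
        for (int i = 1; i <= n; i++) h += 1.0 / i;
        return 2.0 * (n + 1) * h - 3.0 * n;
    }

    public static void main(String[] args) {
        double[] ei = byRecurrence(1000);
        for (int n : new int[] {1, 2, 3, 10, 100, 1000}) {
            System.out.printf("n=%d  recurrence=%.4f  closed form=%.4f  per node=%.4f%n",
                    n, ei[n], closedForm(n), ei[n] / n);
        }
    }
}
```

For n = 3 both expressions give 17/3 ≈ 5.667, which is the average of the internal path lengths over the six insertion orders of three keys (four degenerate trees with 6, two balanced trees with 5).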


Observation

Search, insertion and deletion of a key in a randomly generated binary search tree with n keys can be done, on average, in $O(\log_2 n)$ steps.

In the worst case, the complexity can be $\Omega(n)$.

One can show that the average distance of a node from the root in a randomly generated tree is only about 40% above the optimal value.

However, if deletions always use the symmetric successor, the behaviour becomes worse.

If $n^2$ update operations are carried out in a randomly generated search tree with n keys, the expected average search path length is only $\Theta(\sqrt{n})$ rather than $O(\log n)$.
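For reference, a deletion that always uses the symmetric successor can be sketched as follows. The symmetric successor of a node with two children is the node with the smallest key in its right subtree. The names `DNode` and `DeleteSketch` are hypothetical, and the code is an illustration rather than the lecture's implementation.

```java
class DNode {
    int key;
    DNode left, right;
    DNode(int key) { this.key = key; }
}

class DeleteSketch {
    static DNode insert(DNode n, int key) {
        if (n == null) return new DNode(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;
    }

    // Removes key from the subtree rooted at n and returns the new subtree root.
    static DNode delete(DNode n, int key) {
        if (n == null) return null;                    // key not present
        if (key < n.key) {
            n.left = delete(n.left, key);
        } else if (key > n.key) {
            n.right = delete(n.right, key);
        } else {
            if (n.left == null) return n.right;        // at most one child
            if (n.right == null) return n.left;
            DNode s = n.right;                         // symmetric successor:
            while (s.left != null) s = s.left;         // smallest key in the right subtree
            n.key = s.key;                             // move its key up ...
            n.right = delete(n.right, s.key);          // ... and remove it below
        }
        return n;
    }

    static String inOrder(DNode n) {
        return n == null ? "" : inOrder(n.left) + n.key + " " + inOrder(n.right);
    }

    public static void main(String[] args) {
        DNode root = null;
        for (int k : new int[] {17, 11, 22, 7, 14, 12}) root = insert(root, k);
        root = delete(root, 11);                       // 11 has two children; successor is 12
        System.out.println(inOrder(root).trim());      // 7 12 14 17 22
    }
}
```

Because the replacement key is always taken from the right subtree, repeated deletions and insertions tend to make the tree lopsided, which is the effect the observation above refers to.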


Typical binary tree for a random sequence of keys

Resulting binary tree after $n^2$ updates

Structural analysis of binary trees

Question: What is the average search path length of a binary tree with N internal nodes if the average is taken over all structurally different binary trees with N internal nodes?

Answer: Let

$I_N$ = total internal path length of all structurally different binary trees with N internal nodes

$B_N$ = number of all structurally different binary trees with N internal nodes

Then $I_N / B_N$ = average internal path length of a binary tree with N internal nodes.

Number of structurally different binary trees


Total internal path length of all trees with N nodes

For each tree t with left subtree $t_l$ and right subtree $t_r$: $I(t) = I(t_l) + I(t_r) + \text{number of nodes of } t$.

Summary

The average search path length in a tree with N internal nodes (averaged over all structurally different trees with N internal nodes) is:

$\frac{1}{N} \cdot \frac{I_N}{B_N}$
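For small N these quantities can be tabulated directly. The sketch below assumes two recurrences obtained by splitting a tree with n nodes into a root, a left subtree with k nodes and a right subtree with n-1-k nodes: B_n is the sum over k of B_k · B_{n-1-k} (the Catalan numbers), and I_n is the sum over k of I_k · B_{n-1-k} + I_{n-1-k} · B_k, plus n · B_n because every tree adds one path step per node. It also assumes the convention that a node contributes its depth plus 1. The names `ShapeAverage` and `tables` are hypothetical.

```java
class ShapeAverage {
    // Returns {b, total} where b[n] is the number of structurally different binary
    // trees with n nodes and total[n] is the sum of their internal path lengths.
    // long arithmetic is sufficient for the small n used here (max <= 20).
    static long[][] tables(int max) {
        long[] b = new long[max + 1];
        long[] total = new long[max + 1];
        b[0] = 1;                                   // exactly one empty tree
        for (int n = 1; n <= max; n++) {
            for (int k = 0; k < n; k++) {           // k nodes left, n-1-k nodes right
                b[n] += b[k] * b[n - 1 - k];
                total[n] += total[k] * b[n - 1 - k] + total[n - 1 - k] * b[k];
            }
            total[n] += n * b[n];                   // each node gains one step below the root
        }
        return new long[][] {b, total};
    }

    public static void main(String[] args) {
        long[][] t = tables(20);
        for (int n = 1; n <= 20; n++) {
            double avgSearchPath = (double) t[1][n] / ((double) n * t[0][n]);
            System.out.println(n + "  B=" + t[0][n] + "  I=" + t[1][n]
                    + "  average search path length=" + avgSearchPath);
        }
    }
}
```

For N = 3 this gives B_3 = 5 and I_3 = 29 (four degenerate shapes with internal path length 6 and one balanced shape with 5), so the average search path length over all shapes is 29/15.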
