

    Unit 3 Elementary Data Structures

    Structure

    3.1 Introduction

    Objectives

    3.2 Stacks and queues

    3.3 Linked lists

    3.4 Hash tables

    3.5 Collision resolution by chaining

    3.6 Binary Search Trees

    3.7 Red-Black Trees

    3.8 Rotations

    3.9 Insertion

    3.10 Augmenting Data Structures

    3.11 Determining the rank of an element

    3.12 Maintaining subtree sizes

    3.13 How to augment a data structure

    3.14 Divide-and-Conquer

    3.15 Integer Multiplication

    3.16 Binary Search

    3.17 Sorting

    3.18 Quick Sort

    3.19 Matrix Multiplication

    3.20 Splay tree

    3.21 Zig Step

    3.22 Summary

    3.23 Terminal Questions

    3.24 Answers

    3.1 Introduction

    A stack is an ordered list in which all insertions and deletions are made at

    one end, called the top. A queue is an ordered list in which all insertions


take place at one end, the rear, whereas all deletions take place at the

    other end, the front.

    Objectives

    At the end of this unit the student should be able to:

    Review stacks, queues

    Understand Merge sort

    Find randomized quick sort

    3.2 Stacks and queues

    Stacks and queues are dynamic sets in which the element removed from

    the set by the DELETE operation is prespecified. In a stack, the element

    deleted from the set is the one most recently inserted: the stack implements

    a last-in, first-out, or LIFO, policy. Similarly, in a queue, the element

    deleted is always the one that has been in the set for the longest time: the

    queue implements a first-in, first-out, or FIFO, policy. There are several

efficient ways to implement stacks and queues on a computer. In this section

    we show how to use a simple array to implement each.

    3.2.1 Stacks

    The INSERT operation on a stack is often called PUSH, and the DELETE

    operation, which does not take an element argument, is often called POP.

    These names are references to physical stacks, such as the spring-loaded

    stacks of plates used in cafeterias. The order in which plates are popped

    from the stack is the reverse of the order in which they were pushed onto

    the stack, since only the top plate is accessible.

    As shown in Figure 3.1, we can implement a stack of at most n elements

with an array S [1..n]. The array has an attribute top [S] that indexes the

    most recently inserted element. The stack consists of elements

    S [1 .. top [S]], where S[1] is the element at the bottom of the stack and

    S [top[S]] is the element at the top.


    Figure 3.1 An array implementation of a stack S. Stack elements appear only

    in the lightly shaded positions. (a) Stack S has 4 elements. The top element is

    9. (b) Stack S after the calls PUSH (S, 17) and PUSH (S, 3).

    (c) Stack S after the call POP(S) has returned the element 3, which is the one

    most recently pushed. Although element 3 still appears in the array, it is no

    longer in the stack; the top is element 17.

    When top [S] = 0, the stack contains no elements and is empty. The stack

    can be tested for emptiness by the query operation STACK-EMPTY. If an

    empty stack is popped, we say the stack underflows, which is normally an

    error. If top [S] exceeds n, the stack overflows.

    The stack operations can each be implemented with a few lines of code.

    STACK EMPTY (S)

    1. if top [S] = 0

    2. then return TRUE

    3. else return FALSE

    PUSH (S, x)

1. top [S] ← top [S] + 1
2. S [top [S]] ← x

    POP (S)

    1. if STACK EMPTY (S)

    2. then error underflow

3. else top [S] ← top [S] − 1
4. return S [top [S] + 1]
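For illustration, the same operations can be written in C. The sketch below is a hedged, hypothetical implementation (the names and the capacity MAXN are mine, not the unit's): it uses 0-based indexing and a top field that counts elements, so top == 0 plays the role of top [S] = 0 above.

#include <stdio.h>
#include <stdbool.h>

#define MAXN 100                       /* capacity n of the stack */

typedef struct {
    int data[MAXN];
    int top;                           /* number of elements; 0 means the stack is empty */
} Stack;

bool stack_empty(const Stack *s) { return s->top == 0; }

bool push(Stack *s, int x) {           /* returns false on overflow */
    if (s->top == MAXN) return false;
    s->data[s->top++] = x;
    return true;
}

bool pop(Stack *s, int *out) {         /* returns false on underflow */
    if (stack_empty(s)) return false;
    *out = s->data[--s->top];
    return true;
}

int main(void) {
    Stack s = { .top = 0 };
    push(&s, 15); push(&s, 6); push(&s, 9);
    int x;
    while (pop(&s, &x)) printf("%d ", x);   /* prints 9 6 15: last in, first out */
    return 0;
}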


    3.2.2 Queues

    We call the INSERT operation on a queue ENQUEUE, and we call the

    DELETE operation DEQUEUE; like the stack operation POP, DEQUEUE

    takes no element argument.

Figure 3.2: A queue implemented using an array Q [1..12]. Queue elements

    appear only in the lightly shaded positions. (a) The queue has 5 elements, in

locations Q [7..11]. (b) The configuration of the queue after the calls

    ENQUEUE (Q, 17), ENQUEUE (Q, 5). (c) The configuration of the queue after

the call DEQUEUE (Q) returns the key value 15 formerly at the head of the

    queue. The new head has key 6.

    The FIFO property of a queue causes it to operate like a line of people in the

registrar's office. The queue has a head and a tail. When an element is

    enqueued, it takes its place at the tail of the queue, just as a newly arriving

    student takes a place at the end of the line. The element dequeued is

    always the one at the head of the queue, like the student at the head of the

    line who has waited the longest.


Figure 3.2 shows one way to implement a queue of at most n − 1 elements

    using an array Q [1..n]. The queue has an attribute head [Q] that indexes, or

    points to, its head. The attribute tail [Q] indexes the next location at which a

    newly arriving element will be inserted into the queue. The elements in the

queue are in locations head [Q], head [Q] + 1, ..., tail [Q] − 1, where we wrap

    around in the sense that location 1 immediately follows location n in a

    circular order. When head [Q] = tail [Q], the queue is empty. Initially, we

    have head [Q] = tail [Q] = 1. When the queue is empty, an attempt to

    dequeue an element causes the queue to underflow. When head [Q] = tail

    [Q] + 1, the queue is full, and an attempt to enqueue an element causes the

    queue to overflow.
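As a hedged C sketch (hypothetical names, 0-based indices), the circular array below wraps around with the modulus operator, so an array of size N holds at most N − 1 elements, and the emptiness, underflow and overflow tests match the description above.

#include <stdio.h>
#include <stdbool.h>

#define N 12                            /* array size; the queue holds at most N - 1 elements */

typedef struct {
    int data[N];
    int head;                           /* index of the next element to dequeue */
    int tail;                           /* index of the next free slot */
} Queue;

void queue_init(Queue *q) { q->head = q->tail = 0; }    /* empty when head == tail */

bool enqueue(Queue *q, int x) {         /* returns false on overflow */
    if ((q->tail + 1) % N == q->head) return false;     /* full */
    q->data[q->tail] = x;
    q->tail = (q->tail + 1) % N;
    return true;
}

bool dequeue(Queue *q, int *out) {      /* returns false on underflow */
    if (q->head == q->tail) return false;                /* empty */
    *out = q->data[q->head];
    q->head = (q->head + 1) % N;
    return true;
}

int main(void) {
    Queue q; queue_init(&q);
    enqueue(&q, 15); enqueue(&q, 6); enqueue(&q, 9);
    int x;
    while (dequeue(&q, &x)) printf("%d ", x);            /* prints 15 6 9: first in, first out */
    return 0;
}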

    3.3 Linked lists

    A linked list is a data structure in which the objects are arranged in a linear

    order. Unlike an array, though, in which the linear order is determined by the

    array indices, the order in a linked list is determined by a pointer in each

    object. Linked lists provide a simple, flexible representation for dynamic

    sets.

    As shown in Figure 3.3, each element of a doubly linked list L is an object

    with key field and two other pointer fields: next and prev. The object may

    also contain other satellite data. Given an element x in the list, next [x]

    points to its successor in the linked list, and prev [x] points to its

    predecessor. If prev [x]= NIL, the element x has no predecessor and is

    therefore the first element, or head, of the list. If next [x] = NIL, the element

    x has no successor and is therefore the last element, or tail, of the list. An

    attribute head [L] points to the first element of the list. IF head [L] = NIL, the

    list is empty.

    A list may have one of several forms. It may be either singly linked or doubly

    linked, it may be sorted or not, and it may be circular or not. If a list is singly


    linked, we omit the prev pointer in each element. If a list is sorted, the

    linear order of the list corresponds to the linear order of keys stored in

    elements of the list; the minimum element is the head of the list, and the

    maximum element is the tail. If the list is unsorted, the elements can appear

    in any order. In a circular list, the prev pointer of the head of the list points

    to the tail, and the next pointer of the tail of the list points to the head. The

    list may thus be viewed as a ring of elements.

    Figure 3.3 (a) A doubly linked list L representing the dynamic set

    {1, 4, 9, 16}. Each element in the list is an object with fields for the key and

    pointers (shown by arrows) to the next and previous objects. The next field of

    the tail and the prev field of the head are NIL, indicated by a diagonal slash.

    The attribute head [L] points to the head. (b) Following the execution of LIST-

    INSERT (L, x), where key [x] = 25, the linked list has a new object with key 25

    as the new head. This new object points to the old head with key 9. (c) The

    result of the subsequent call LIST-DELETE (L, x), where x points to the object

    with key 4.

    3.3.1 Searching a linked list

    The procedure LIST-SEARCH (L, k) finds the first element with key k in list L

    by a simple linear search, returning a pointer to this element. If no object

    with key k appears in the list, then NIL is returned. For the linked list in

    Figure 3.3 (a), the call

    LIST-SEARCH (L, 4) returns a pointer to the third element, and the call

    LIST-SEARCH (L, 7) returns NIL.


    LIST SEARCH (L, k)

1. x ← head [L]
2. while x ≠ NIL and key [x] ≠ k
3. do x ← next [x]

    4. return x

To search a list of n objects, the LIST-SEARCH procedure takes Θ(n) time

    in the worst case, since it may have to search the entire list.

    Inserting into a linked list

    Given an element x whose key field has already been set, the LIST-INSERT

    procedure splices x onto the front of the linked list, as shown in Figure 3.3(b).

    LIST INSERT (L, x)

1. next [x] ← head [L]
2. if head [L] ≠ NIL
3. then prev [head [L]] ← x
4. head [L] ← x
5. prev [x] ← NIL

    3.3.2 Deleting from a linked list

The procedure LIST-DELETE removes an element x from a linked list L. It must

    be given a pointer to x, and it then splices x out of the list by updating

    pointers. If we wish to delete an element with a given key, we must first call

    LIST-SEARCH to retrieve a pointer to the element.

    LIST DELETE (L, x)

1. if prev [x] ≠ NIL
2. then next [prev [x]] ← next [x]
3. else head [L] ← next [x]
4. if next [x] ≠ NIL
5. then prev [next [x]] ← prev [x]

    Figure 3.3(c) shows how an element is deleted from a linked list.
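The three list procedures translate directly into C. The sketch below is illustrative (the names are mine): the head pointer is passed in and returned instead of being stored in an attribute head [L], and NULL plays the role of NIL.

#include <stdlib.h>

typedef struct Node {
    int key;
    struct Node *prev, *next;
} Node;

/* LIST-SEARCH: linear scan for the first node with key k, or NULL if absent. */
Node *list_search(Node *head, int k) {
    Node *x = head;
    while (x != NULL && x->key != k)
        x = x->next;
    return x;
}

/* LIST-INSERT: splice a new node onto the front of the list; returns the new head. */
Node *list_insert(Node *head, int k) {
    Node *x = malloc(sizeof *x);
    x->key = k;
    x->next = head;
    x->prev = NULL;
    if (head != NULL)
        head->prev = x;
    return x;
}

/* LIST-DELETE: splice node x out of the list; returns the (possibly new) head. */
Node *list_delete(Node *head, Node *x) {
    if (x->prev != NULL) x->prev->next = x->next;
    else                 head = x->next;
    if (x->next != NULL) x->next->prev = x->prev;
    free(x);
    return head;
}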


    3.4 Hash tables

    The difficulty with direct addressing is obvious: if the universe U is large,

    storing a table T of size |U | may be impractical, or even impossible, given

    the memory available on a typical computer. Furthermore, the set K of keys

    actually stored may be so small relative to U that most of the space

    allocated for T would be wasted.

    When the set K of keys stored in a dictionary is much smaller than the

    universe U of all possible keys, a hash table requires much less storage

    than a direct-address table. Specifically, the storage requirements can be

reduced to Θ(|K|) while we maintain the benefit that searching for an element in the hash table still requires only O (1) time. The only catch is that this bound is for the average time, whereas for direct addressing it holds for the worst-case time.

    With direct addressing, an element with key k is stored in slot k. With

hashing, this element is stored in slot h (k); that is, we use a hash function

    h to compute the slot from the key k. Here h maps the universe U of keys

into the slots of a hash table T [0..m − 1]:
h : U → {0, 1, ..., m − 1}

    We say that an element with key k hashes to slot h(k); we also say that h(k)

    is the hash value of key k. Figure 3.4(a) illustrates the basic idea. The point

    of the hash function is to reduce the range of array indices that need to be

    handled. Instead of |U | values, we need to handle only m values. Storage

    requirements are correspondingly reduced.

    There is one hitch: two keys may hash to the same slot. We call this

    situation a collision. Fortunately, there are effective techniques for

    resolving the conflict created by collisions.


    Of course, the ideal solution would be to avoid collisions altogether. We

    might try to achieve this goal by choosing a suitable hash function h. One

    idea is to make h appear to be random, thus avoiding collisions or at least

    minimizing their number. The very term to hash, evoking images of

    random mixing and chopping, captures the spirit of this approach.

    Since |U | > m, however, there must be at least two keys that have the same

    hash value; avoiding collisions altogether is therefore impossible. Thus,

    while a well designed, random-looking hash function can minimize the

    number of collisions, we still need a method for resolving the collisions that

    do occur.

    The remainder of this section presents the simplest collision resolution

    technique, called chaining.

    Figure 3.4 (a) Using a hash function h to map keys to hash-table slots. Keys k2

    and k5 map to the same slot, so they collide.


    Figure 3.4(b): Collision resolution by chaining. Each hash-table slot T [j]

    contains a linked list of all the keys whose hash value is j. For example,

    h (k1) = h (k4) and h (k5) = h (k2) = h (k1).

    3.5 Collision resolution by chaining

    In chaining, we put all the elements that hash to the same slot in a linked

    list, as shown in Figure 3.4. Slot j contains a pointer to the head of the list of

    all stored elements that hash to j; if there are no such elements, slot j

    contains NIL.

    The dictionary operations on a hash table T are easy to implement when

    collisions are resolved by chaining.
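A minimal C sketch of chaining follows. It is an assumption-laden illustration: the number of slots M and the simple division-method hash h(k) = k mod M are choices made here for brevity, not prescribed by the text.

#include <stdio.h>
#include <stdlib.h>

#define M 7                              /* number of slots (small, for illustration) */

typedef struct Entry {
    int key;
    struct Entry *next;                  /* chain of keys that hash to the same slot */
} Entry;

static Entry *table[M];                  /* T[0..M-1]; NULL means the slot is empty */

static unsigned h(int k) { return (unsigned)k % M; }   /* division-method hash */

/* CHAINED-HASH-INSERT: put the key at the head of list T[h(k)]. */
void chained_insert(int k) {
    Entry *e = malloc(sizeof *e);
    e->key = k;
    e->next = table[h(k)];
    table[h(k)] = e;
}

/* CHAINED-HASH-SEARCH: scan the chain in slot h(k). */
Entry *chained_search(int k) {
    for (Entry *e = table[h(k)]; e != NULL; e = e->next)
        if (e->key == k) return e;
    return NULL;
}

int main(void) {
    chained_insert(10); chained_insert(17); chained_insert(3);   /* 10, 17 and 3 all hash to slot 3 */
    printf("%s\n", chained_search(17) ? "found 17" : "not found");
    return 0;
}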

    3.6 Binary Search Trees

    Search trees are data structures that support many dynamic-set

    operations, including SEARCH, MINIMUM, MAXIMUM, PREDECESSOR,

    SUCCESSOR, INSERT, and DELETE. Thus, a search tree can be used

    both as a dictionary and as a priority queue.

    Basic operations on a binary search tree take time proportional to the height

    of the tree.


    3.6.1 What is a binary search tree?

    A binary search tree is organized, as the name suggests, in a binary tree, as

    shown in Figure 3.5. Such a tree can be represented by a linked data

    structure in which each node is an object. In addition to a key field and

satellite data, each node contains fields left, right, and p that point to the nodes corresponding to its left child, its right child, and its parent, respectively. If a child or the parent is missing, the appropriate field contains the value NIL. The root node is the only node in the tree whose parent field is NIL.

Figure 3.5: Binary search trees. For any node x, the keys in the left subtree of x are at most key [x], and the keys in the right subtree of x are at least key [x]. Different binary search trees can represent the same set of values. The worst-case running time for most search-tree operations is proportional to the height of the tree. (a) A binary search tree on 6 nodes with height 2. (b) A less efficient binary search tree with height 4 that contains the same keys.

    The keys in a binary search tree are always stored in such a way as to

    satisfy the binary-search-tree property:

    Let x be a node in a binary search tree. If y is a node in the left subtree of x,

then key [y] ≤ key [x]. If y is a node in the right subtree of x, then key [x] ≤ key [y].

    Thus, in Figure 3.5(a), the key of the root is 5, the keys 2, 3, and 5 in its left

    subtree are no larger than 5, and the keys 7 and 8 in its right subtree are no

    smaller than 5. The same property holds for every node in the tree. For


    example, the key 3 in Figure 3.5(a) is no smaller than the key 2 in its left

    subtree and no larger than the key 5 in its right subtree.

    The binary-search-tree property allows us to print out all the keys in a binary

    search tree in sorted order by a simple recursive algorithm, called an

    inorder tree walk. This algorithm is so named because the key of the root

    of a subtree is printed between the values in its left subtree and those in its

    right subtree. (Similarly, a preorder tree walk prints the root before the

    values in either subtree, and a postorder tree walk prints the root after the

    values in its subtrees.) To use the following procedure to print all the

    elements in a binary search tree T, we call INORDER-TREE-WALK

    (root[T]).

    INORDER TREE WALK (x)

1. if x ≠ NIL

    2. then INORDER TREE WALK (left[x])

    3. print key [x]

    4. INORDER TREE WALK (right [x])

    As an example, the inorder tree walk prints the keys in each of the two

    binary search trees from Figure 3.5 in the order 2, 3, 5, 5, 7, 8. The

    correctness of the algorithm follows by induction directly from the binary-

    search-tree property.
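A C sketch of the node structure and the inorder walk follows (illustrative names; NULL plays the role of NIL):

#include <stdio.h>

typedef struct TreeNode {
    int key;
    struct TreeNode *left, *right, *p;   /* children and parent */
} TreeNode;

/* INORDER-TREE-WALK: print the keys of the subtree rooted at x in sorted order. */
void inorder_tree_walk(const TreeNode *x) {
    if (x != NULL) {
        inorder_tree_walk(x->left);
        printf("%d ", x->key);
        inorder_tree_walk(x->right);
    }
}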

It takes Θ(n) time to walk an n-node binary search tree, since after the

    initial call, the procedure is called recursively exactly twice for each node in

the tree: once for its left child and once for its right child. The following

    theorem gives a more formal proof that it takes linear time to perform an

    inorder tree walk.

    Theorem

    If x is the root of an n-node subtree, then the call INORDER TREE

WALK (x) takes Θ(n) time.


    Proof Let T(n) denote the time taken by INORDER TREE WALK when

    it is called on the root of an n-node subtree. INORDER TREE WALK

    takes a small, constant amount of time on an empty subtree

(for the test x ≠ NIL), and so T(0) = c for some positive constant c.

    For n > 0, suppose that INORDER TREE WALK is called on a node x

whose left subtree has k nodes and whose right subtree has n − k − 1 nodes. The time to perform INORDER-TREE-WALK(x) is T(n) = T(k) + T(n − k − 1) + d for some positive constant d that reflects the time to execute

    INORDER TREE WALK(x), exclusive of the time spent in recursive calls.

We use the substitution method to show that T(n) = Θ(n) by proving that T(n) = (c + d)n + c. For n = 0, we have (c + d) · 0 + c = c = T(0). For n > 0, we have

T(n) = T(k) + T(n − k − 1) + d
= ((c + d)k + c) + ((c + d)(n − k − 1) + c) + d
= (c + d)n + c − (c + d) + c + d
= (c + d)n + c,

which completes the proof.

    3.6.2 Querying a binary search tree

    A common operation performed on a binary search tree is searching for a

    key stored in the tree. Besides the SEARCH operation, binary search trees

    can support such queries as MINIMUM, MAXIMUM, SUCCESSOR, and

    PREDECESSOR. In this section, we shall examine these operations and

    show that each can be supported in time O (h) on a binary search tree of

    height h.

    3.6.3 Searching

    We use the following procedure to search for a node with a given key in a

    binary search tree. Given a pointer to the root of the tree and a key k,


    TREE-SEARCH returns a pointer to a node with key k if one exists;

    otherwise, it returns NIL.

    Fig. 3.6

    Figure 3.6: Queries on a binary search tree. To search for the key 13 in the

tree, we follow the path 15 → 6 → 7 → 13 from the root. The minimum key in

    the tree is 2, which can be found by following left pointers from the root. The

    maximum key 20 is found by following right pointers from the root. The

    successor of the node with key 15 is the node with key 17, since it is the

    minimum key in the right subtree of 15. The node with key 13 has no right

    subtree, and thus its successor is its lowest ancestor whose left child is also

    an ancestor. In this case, the node with key 15 is its successor.

    TREE SEARCH (x, k)

    1. if x =NIL or k = key [x]

    2. then return x

    3. if k < key [x]

4. then return TREE-SEARCH (left [x], k)
5. else return TREE-SEARCH (right [x], k)

    The procedure begins its search at the root and traces a path downward in

    the tree Figure 3.6. For each node x it encounters, it compares the key k

    with key [x]. If the two keys are equal, the search terminates. If k is smaller


    than key [x], the search continues in the left subtree of x, since the binary-

    search-tree property implies that k could not be stored in the right subtree.

    Symmetrically, if k is larger than key [x], the search continues in the right

    subtree. The nodes encountered during the recursion form a path downward

    from the root of the tree, and thus the running time of TREE-SEARCH is O

    (h), where h is height of the tree.

    The same procedure can be written iteratively by unrolling the recursion

    into a while loop.

ITERATIVE-TREE-SEARCH (x, k)
1. while x ≠ NIL and k ≠ key [x]
2. do if k < key [x]
3. then x ← left [x]
4. else x ← right [x]
5. return x


Minimum and maximum

An element in a binary search tree whose key is a minimum can always be found by following left child pointers from the root until a NIL is encountered:

TREE-MINIMUM (x)
1. while left [x] ≠ NIL
2. do x ← left [x]
3. return x

The binary-search-tree property guarantees that TREE-MINIMUM is correct: if a node x has no left subtree, the minimum key in the subtree rooted at x is key [x]; and if x has a left subtree, then since no key in the right subtree is smaller than key [x] and every key in the left subtree is not larger than key [x], the minimum key in the subtree rooted at x can be found in the subtree rooted at left [x].

The pseudocode for TREE-MAXIMUM is symmetric.

TREE-MAXIMUM (x)
1. while right [x] ≠ NIL
2. do x ← right [x]
3. return x

    Both of these procedures run in O (h) time on a tree of height h since, as in

    TREE SEARCH, the sequence of nodes encountered forms a path

    downward from the root.
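Continuing the hypothetical TreeNode sketch above, the iterative search and the minimum and maximum procedures can be written in C as follows (illustrative, not the unit's own code):

/* ITERATIVE-TREE-SEARCH: walk down from x, comparing k with each key. */
TreeNode *tree_search(TreeNode *x, int k) {
    while (x != NULL && k != x->key)
        x = (k < x->key) ? x->left : x->right;
    return x;                            /* node with key k, or NULL if absent */
}

/* TREE-MINIMUM and TREE-MAXIMUM: follow left (respectively right) pointers. */
TreeNode *tree_minimum(TreeNode *x) {
    while (x->left != NULL) x = x->left;
    return x;
}

TreeNode *tree_maximum(TreeNode *x) {
    while (x->right != NULL) x = x->right;
    return x;
}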

    3.6.4 Successor and predecessor

    Given a node in a binary search tree, it is sometimes important to be able to

    find its successor in the sorted order determined by an in order tree walk. If

    all keys are distinct, the successor of a node x is the node with the smallest

    key greater than key [x]. The structure of a binary search tree allows us to

    determine the successor of a node without ever comparing keys. The

    following procedure returns the successor of a node x in a binary search

    tree if it exists, and NIL if x has the largest key in the tree.

    TREE SUCCESSOR (x)

1. if right [x] ≠ NIL
2. then return TREE-MINIMUM (right [x])
3. y ← p [x]
4. while y ≠ NIL and x = right [y]
5. do x ← y
6. y ← p [y]

    7. return y


The code for TREE-SUCCESSOR is broken into two cases. If the right subtree of node x is nonempty, then the successor of x is just the leftmost node

    in the right subtree, which is found in line 2 by calling TREE MINIMUM

    (right [x]). For example, the successor of the node with key 15 in Figure 3.6

    is the node with key 17.

    On the other hand, if the right subtree of node x is empty and x has a

    successor y, then y is the lowest ancestor of x whose left child is also an

    ancestor of x. In Figure 3.6, the successor of the node with key 13 is the

    node with key 15. To find y, we simply go up the tree from x until we

    encounter a node that is the left child of its parent; this is accomplished by

    lines 3-7 of TREE SUCCESSOR.

    The running time of TREE SUCCESSOR on a tree of height h is O (h),

    since we either follow a path up the tree or follow a path down the tree.

    The procedure TREE PREDECESSOR, which is symmetric to TREE

    SUCCESSOR, also runs in time O (h).
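A C sketch of the successor computation, again assuming the hypothetical TreeNode type above (case 1 simply takes the leftmost node of the right subtree):

/* TREE-SUCCESSOR: next node in inorder order, or NULL if x holds the largest key. */
TreeNode *tree_successor(TreeNode *x) {
    if (x->right != NULL) {              /* case 1: minimum of the right subtree */
        TreeNode *y = x->right;
        while (y->left != NULL) y = y->left;
        return y;
    }
    TreeNode *y = x->p;                  /* case 2: climb until we leave a left child */
    while (y != NULL && x == y->right) {
        x = y;
        y = y->p;
    }
    return y;
}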

    Even if keys are not distinct, we define the successor and predecessor of

    any node x as the node returned by calls made to TREE SUCCESSOR (x)

    and TREE PREDECESSOR (x), respectively.

    We have proved the following theorem.

    Theorem

    The dynamic-set operations SEARCH, MINIMUM, MAXIMUM, SUCCESSOR,

    and PREDECESSOR can be made to run in O (h) time on a binary search

    tree of height h.

    3.7 Red-Black Trees

    A red-black tree is a binary search tree with one extra bit of storage per

    node: its color, which can be either RED or BLACK. By constraining the

    way nodes can be colored on any path from the root to a leaf, red-black

    trees ensure that no such path is more than twice as long as any other, so

    that the tree is approximately balanced.


    Each node of the tree now contains the fields color, key, left, right, and p. If

    a child or the parent of a node does not exist, the corresponding pointer field

    of the node contains the value NIL. We shall regard these NILs as being

    pointers to external nodes (leaves) of the binary search tree and the normal,

    key-bearing nodes as being internal nodes of the tree.

    A binary search tree is a red-black tree if it satisfies the following red-black

    properties:

    1. Every node is either red or black.

    2. The root is black.

    3. Every leaf (NIL) is black.

    4. If a node is red, then both its children are black.

    5. For each node, all paths from the node to descendant leaves contain the

    same number of black nodes.

    Figure 3.7 (a) shows an example of a red-black tree.

    As a matter of convenience in dealing with boundary conditions in red-black

    tree code, we use a single sentinel to represent NIL. For a red-black tree T,

    the sentinel nil [T] is an object with the same fields as an ordinary node in

    the tree. Its color field is BLACK, and its other fields p, left, right, and key

can be set to arbitrary values. As Figure 3.7(b) shows, all pointers to NIL are

    replaced by pointers to the sentinel nil [T].

    We use the sentinel so that we can treat a NIL child of a node x as an ordinary

    node whose parent is x. Although we instead could add a distinct sentinel node

    for each NIL in the tree, so that the parent of each NIL is well defined, that

    approach would waste space. Instead, we use the one sentinel nil [T] to

represent all the NILs: all leaves and the root's parent. The values of the fields

    p, left, right, and key of the sentinel are immaterial, although we may set them

    during the course of a procedure for our convenience.


    We generally confine our interest to the internal nodes of a red-black tree,

    since they hold the key values. In the remainder of this chapter, we omit the

    leaves when we draw red-black trees, as shown in Figure 3.7 (c).

    We call the number of black nodes on any path from, but not including, a

    node x down to a leaf the black-height of the node, denoted bh (x). By

    property 5, the notion of black-height is well defined, since all descending

    paths from the node have the same number of black nodes. We define the

    black-height of a red-black tree to be the black-height of its root.

    The following lemma shows why red-black trees make good search trees.

    Lemma

A red-black tree with n internal nodes has height at most 2 lg(n + 1).

Proof We start by showing that the subtree rooted at any node x contains at least 2^bh(x) − 1 internal nodes. We prove this claim by induction on the height of x. If the height of x is 0, then x must be a leaf (nil [T]), and the subtree rooted at x indeed contains at least 2^bh(x) − 1 = 2^0 − 1 = 0 internal nodes. For the inductive step, consider a node x that has positive height and is an internal node with two children. Each child has a black-height of either bh (x) or bh (x) − 1, depending on whether its color is red or black, respectively. Since the height of a child of x is less than the height of x itself, we can apply the inductive hypothesis to conclude that each child has at least 2^(bh(x)−1) − 1 internal nodes. Thus, the subtree rooted at x contains at least (2^(bh(x)−1) − 1) + (2^(bh(x)−1) − 1) + 1 = 2^bh(x) − 1 internal nodes, which proves the claim.

    To complete the proof of the lemma, let h be the height of the tree.

    According to property 4, at least half the nodes on any simple path from the

    root to a leaf, not including the root, must be black. Consequently, the black-

    height of the root must be at least2

    h; thus, 12n 2

    h

    .


    Moving the 1 to the left-hand side and taking logarithms on both sides yields

lg(n + 1) ≥ h/2, or h ≤ 2 lg(n + 1).

    Figure 3.7: A red-black tree with black nodes darkened and red nodes shaded.

    Every node in a red-black tree is either red or black, the children of a red node

are both black, and every simple path from a node to a descendant leaf

    contains the same number of black nodes. (a) Every leaf, shown as a NIL, is

    black. Each non-NIL node is marked with its black-height; NILs have black-


    height 0. (b) The same red-black tree but with each NIL replaced by the single

    sentinel nil [T], which is always black, and with black-heights omitted. The

root's parent is also the sentinel. (c) The same red-black tree but with leaves and the root's parent omitted entirely.

    3.8 Rotations

    The search-tree operations TREE-INSERT and TREE-DELETE, when run

    on a red-black tree with n keys, take O(lg n) time. Because they modify the

    tree, the result may violate the red-black properties. To restore these

    properties, we must change the colors of some of the nodes in the tree and

    also change the pointer structure.

    We change the pointer structure through rotation, which is a local operation

    in a search tree that preserves the binary-search-tree property. Figure 3.8

    shows the two kinds of rotations: left rotations and right rotations. When we

do a left rotation on a node x, we assume that its right child y is not nil [T]; x may be any node in the tree whose right child is not nil [T]. The left rotation pivots around the link from x to y. It makes y the new root of the subtree, with x as y's left child and y's left child as x's right child.

The pseudocode for LEFT-ROTATE assumes that right [x] ≠ nil [T] and that the root's parent is nil [T].

    Figure 3.8: The rotation operations on a binary search tree. The operation

    LEFT-ROTATE (T, x) transforms the configuration of the two nodes on the left

    into the configuration on the right by changing a constant number of pointers.


    The configuration on the right can be transformed into the configuration on

the left by the inverse operation RIGHT-ROTATE (T, y). The letters α, β, and γ represent arbitrary subtrees. A rotation operation preserves the binary-search-tree property: the keys in α precede key [x], which precedes the keys in β, which precede key [y], which precedes the keys in γ.
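The text refers to the pseudocode for LEFT-ROTATE without reproducing it; the following C sketch shows the constant number of pointer changes involved. It is an illustration under stated assumptions: the RBNode type and the names are hypothetical, and NULL stands in for the sentinel nil [T].

typedef struct RBNode {
    int key;
    char color;                          /* 'R' or 'B' */
    struct RBNode *left, *right, *p;
} RBNode;

/* LEFT-ROTATE (T, x): assumes x->right is not NULL; y takes x's place. */
void left_rotate(RBNode **root, RBNode *x) {
    RBNode *y = x->right;
    x->right = y->left;                  /* turn y's left subtree into x's right subtree */
    if (y->left != NULL) y->left->p = x;
    y->p = x->p;                         /* link x's parent to y */
    if (x->p == NULL)         *root = y;
    else if (x == x->p->left) x->p->left = y;
    else                      x->p->right = y;
    y->left = x;                         /* put x on y's left */
    x->p = y;
}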

    3.9 Insertion

    Insertion of a node into an n-node red-black tree can be accomplished in

    O (lg n) time. We use a slightly modified version of the TREE-INSERT

    procedure to insert node z into the tree T as if it was an ordinary binary

    search tree, and then we color z red. To guarantee that the red-black

    properties are pre-served, we then call an auxiliary procedure RB-

    INSERT-FIXUP to recolor nodes and perform rotations. The call

    RB-INSERT (T, z) inserts node z, whose key field is assumed to have

    already been filled in, into the red-black tree T.

    RB INSERT (T, z)

1. y ← nil [T]
2. x ← root [T]
3. while x ≠ nil [T]
4. do y ← x
5. if key [z] < key [x]
6. then x ← left [x]
7. else x ← right [x]
8. p [z] ← y
9. if y = nil [T]
10. then root [T] ← z
11. else if key [z] < key [y]
12. then left [y] ← z


13. else right [y] ← z
14. left [z] ← nil [T]
15. right [z] ← nil [T]
16. color [z] ← RED

    17. RB INSERT FIXUP (T, z)

    There are four differences between the procedures TREE-INSERT and

    RB-INSERT. First, all instances of NIL in TREE-INSERT are replaced

by nil [T]. Second, we set left [z] and right [z] to nil [T] in lines 14-15 of RB-INSERT, in order to maintain the proper tree structure. Third, we color z

    red in line 16. Fourth, because coloring z red may cause a violation of one

    of the red-black properties, we call RB INSERT FIXUP (T, z) in line 17

    of RB INSERT to restore the red-black properties.

    3.10 Augmenting Data Structures

The idea of augmenting a data structure is fairly simple. We want to add data

    to the elements of one data structure that helps us to quickly get to some

    type of information.

    3.10.1 Dynamic order statistics

The i th order statistic of a set of n elements, where i ∈ {1, 2, ..., n}, is

    simply the element in the set with the i th smallest key. We saw that any

    order statistic could be retrieved in O (n) time from an unordered set. In this

    section, we shall see how red-black trees can be modified so that any

    orders statistic can be determined in O (lg n) time. We shall also see how

the rank of an element (its position in the linear order of the set) can

    likewise be determined in O (lg n) time.


    Figure 3.9 An order-statistic tree, which is an augmented red-black tree.

    Shaded nodes are red, and darkened nodes are black. In addition to its usual

    fields, each node x has a field size [x], which is the number of nodes in the

    subtree rooted at x.

    A data structure that can support fast order-statistics operations is shown in

    Figure 3.9. An order-statistic tree T is simply a red-black tree with

    additional information stored in each node. Besides the usual red-black tree

fields key [x], color [x], p [x], left [x] and right [x] in a node x, we have

    another field size [x]. This field contains the number of (internal) nodes in

    the subtree rooted at x (including x itself), that is, the size of the subtree. If

we define the sentinel's size to be 0, that is, we set size [nil [T]] to be 0,

    then we have the identity

    size [x] = size [left [x]] + size [right[x]] + 1.

    We do not require keys to be distinct in an order-statistic tree. (For example,

the tree in Figure 3.9 has two keys with value 14 and two keys with

    value 21.) In the presence of equal keys, the above notion of rank is not well

    defined. We remove this ambiguity for an order-statistic tree by defining the

    rank of an element as the position at which it would be printed in an inorder

    walk of the tree. In Figure 3.9, for example, the key 14 stored in a black

    node has rank 5, and the key 14 stored in a red node has rank 6.

    3.10.2 Retrieving an element with a given rank

    Before we show how to maintain this size information during insertion and

    deletion, let us examine the implementation of two order-statistic queries


    that use this additional information. We begin with an operation that

    retrieves an element with a given rank. The procedure OS SELECT (x, i)

    returns a pointer to the node containing the i th smallest key in the subtree

    rooted at x. To find the i th smallest key in an order-statistic tree T, we call

OS-SELECT (root [T], i).

    OS SELECT (x, i)

1. r ← size [left [x]] + 1

    2. if i = r

    3. then return x

    4. elseif i < r

    5. then return OS SELECT (Left[x], i)

6. else return OS-SELECT (right [x], i − r)

    The idea behind OS SELECT is similar to that of the selection algorithms.

    The value of size [left [x]] is the number of nodes that come before x in an

    inorder tree walk of the subtree rooted at x. Thus, size [left [x]] +1 is the rank

    of x within the subtree rooted at x.

    In line 1 of OS SELECT, we compute r, the rank of node x within the

    subtree rooted at x. If i = r, then node x is the i th smallest element, so we

return x in line 3. If i < r, then the i th smallest element is in x's left subtree, so we recurse on left [x] in line 5. If i > r, then the i th smallest element is in x's right subtree. Since there are r elements in the subtree rooted at x that come before x's right subtree in an inorder tree walk, the i th smallest element in the subtree rooted at x is the (i − r) th smallest element in the

    subtree rooted at right [x]. This element is determined recursively in line 6.
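A C sketch of OS-SELECT on a size-augmented node (the OSNode type and the names are hypothetical; node_size treats a missing child as size 0, playing the role of the sentinel's size):

typedef struct OSNode {
    int key;
    int size;                            /* number of internal nodes in this subtree */
    struct OSNode *left, *right, *p;
} OSNode;

static int node_size(const OSNode *x) { return x ? x->size : 0; }

/* OS-SELECT: node holding the i-th smallest key in x's subtree (requires 1 <= i <= x->size). */
OSNode *os_select(OSNode *x, int i) {
    int r = node_size(x->left) + 1;      /* rank of x within its own subtree */
    if (i == r) return x;
    if (i < r)  return os_select(x->left, i);
    return os_select(x->right, i - r);
}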

    To see how OS SELECT operates, consider a search for the 17th smallest

    element in the order-statistic tree of Figure 3.9. We begin with x as the root,

whose key is 26, and with i = 17. Since the size of 26's left subtree is 12, its rank is 13. Thus, we know that the node with rank 17 is the 17 − 13 = 4th


smallest element in 26's right subtree. After the recursive call, x is the node with key 41, and i = 4. Since the size of 41's left subtree is 5, its rank within its subtree is 6. Thus, we know that the node with rank 4 is the 4th smallest element in 41's left subtree. After the recursive call, x is the node with key 30, and its rank within its subtree is 2. Thus, we recurse once again to find the 4 − 2 = 2nd smallest element in the subtree rooted at the node with key

    38. We now find that its left subtree has size 1, which means it is the second

    smallest element. Thus, a pointer to the node with key 38 is returned by the

    procedure.

    Because each recursive call goes down one level in the order-statistic tree,

    the total time for OS SELECT is at worst proportional to the height of the

    tree. Since the tree is a red-black tree, its height is O (lg n), where n is the

    number of nodes. Thus, the running time of OS SELECT is O (lg n) for a

    dynamic set of n elements.

    3.11 Determining the rank of an element

    Given a pointer to a node x in an order-statistic tree T, the procedure

    OS RANK returns the position of x in the linear order determined by an

    inorder tree walk of T.

    OS RANK (T, x)

1. r ← size [left [x]] + 1
2. y ← x
3. while y ≠ root [T]
4. do if y = right [p [y]]
5. then r ← r + size [left [p [y]]] + 1
6. y ← p [y]

    7. return r


    The procedure works as follows. The rank of x can be viewed as the number

    of nodes preceding x in an inorder tree walk, plus 1 for x itself. OS RANK

    maintains the following loop invariant:

At the start of each iteration of the while loop of lines 3-6, r is the rank of

    key [x] in the subtree rooted at node y.

    We use this loop invariant to show that OS RANK works correctly as

    follows:

    Initialization: Prior to the first iteration, line 1 sets r to be the rank of key [x]

within the subtree rooted at x. Setting y ← x in line 2 makes the invariant

    true the first time the test in line 3 executes.

    Maintenance: At the end of each iteration of the while loop, we set

y ← p [y]. Thus we must show that if r is the rank of key [x] in the subtree

    rooted at y at the start of the loop body, then r is the rank of key [x] in the

    subtree rooted at p[y] at the end of the loop body. In each iteration of the

    while loop, we consider the subtree rooted at p [ y ]. We have already

    counted the number of nodes in the subtree rooted at node y that precede x

in an inorder walk, so we must add the nodes in the subtree rooted at y's sibling that precede x in an inorder walk, plus 1 for p [y] if it, too, precedes x. If y is a left child, then neither p [y] nor any node in p [y]'s right subtree precedes x, so we leave r alone. Otherwise, y is a right child and all the nodes in p [y]'s left subtree precede x, as does p [y] itself. Thus, in line 5,

    we add size [left[p[y]]]+1 to the current value of r.

    Termination: The loop terminates when y = root [T], so that the subtree

    rooted at y is the entire tree. Thus, the value of r is the rank of key [x] in the

    entire tree.
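Continuing the hypothetical OSNode sketch, OS-RANK can be written in C as follows (illustrative):

/* OS-RANK: position of node x in an inorder walk of the whole tree (1-based). */
int os_rank(const OSNode *root, const OSNode *x) {
    int r = node_size(x->left) + 1;      /* rank of x within the subtree rooted at x */
    const OSNode *y = x;
    while (y != root) {
        if (y == y->p->right)            /* y's parent and the left sibling subtree precede x */
            r += node_size(y->p->left) + 1;
        y = y->p;
    }
    return r;
}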

    As an example, when we run OS RANK on the order-statistic tree of

    Figure 3.9 to find the rank of the node with key 38, we get the following


    sequence of values of key [y] and r at the top of the while loop:

Iteration    key [y]    r
1            38         2
2            30         4
3            41         4
4            26         17

    Figure 3.10 Updating subtree sizes during rotations. The link around which the rotation is performed is incident on the two nodes whose size fields need to be updated. The updates are local, requiring only the size information stored in x, y, and the roots of the subtrees shown as triangles.

    The rank 17 is returned.

    Since each iteration of the while loop takes O (1) time, and y goes up one

    level in the tree with each iteration, the running time of OS RANK is at

    worst proportional to the height of the tree: O (lg n) on an n-node order-

    statistic tree.

    3.12 Maintaining subtree sizes

    Given the size field in each node, OS SELECT and OS RANK can

    quickly compute order-statistic information. But unless these fields can be

efficiently maintained by the basic modifying operations on red-black trees,

    our work will have been for naught. We shall now show that subtree sizes

    can be maintained for both insertion and deletion without affecting the

    asymptotic running time of either operation.


    We know that insertion into a red-black tree consists of two phases. The first

    phase goes down the tree from the root, inserting the new node as a child of

    an existing node. The second phase goes up the tree, changing colors and

    ultimately performing rotations to maintain the red-black properties.

    To maintain the subtree sizes in the first phase, we simply increment size [x]

    for each node x on the path traversed from the root down toward the leaves.

    The new node added gets a size of 1. Since there are O (lg n) nodes on the

    traversed path, the additional cost of maintaining the size fields is O (lg n).

    In the second phase, the only structural changes to the underlying red-black

    tree are caused by rotations, of which there are at most two. Moreover, a

    rotation is a local operation: only two nodes have their size fields

    invalidated. The link around which the rotation is performed is incident on

    these two nodes. Referring to the code for LEFT ROTATE (T, x), we add

    the following lines:

12. size [y] ← size [x]
13. size [x] ← size [left [x]] + size [right [x]] + 1

    Figure 3.10 illustrates how the fields are updated. The change to RIGHT

    ROTATE is symmetric.
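In C, the same maintenance can be appended to the rotation sketched earlier, here shown for a size-augmented OSNode (illustrative; only the last two statements are new relative to the plain rotation):

void os_left_rotate(OSNode **root, OSNode *x) {
    OSNode *y = x->right;
    x->right = y->left;
    if (y->left != NULL) y->left->p = x;
    y->p = x->p;
    if (x->p == NULL)         *root = y;
    else if (x == x->p->left) x->p->left = y;
    else                      x->p->right = y;
    y->left = x;
    x->p = y;
    y->size = x->size;                                        /* line 12: y now roots x's old subtree */
    x->size = node_size(x->left) + node_size(x->right) + 1;   /* line 13: recompute x from its new children */
}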

    Since at most two rotations are performed during insertion into a red-black

    tree, only O (1) additional time is spent updating size fields in the second

    phase. Thus, the total time for insertion into an n-node order-statistic tree is

    O (lg n), which is asymptotically the same as for an ordinary red-black tree.

    Deletion from a red-black tree also consists of two phases: the first operates

    on the underlying search tree, and the second causes at most three rotations

    and otherwise performs no structural changes. The first phase splices out one

    node y. To update the subtree sizes, we simply traverse a path from node y

    up to the root, decrementing the size field of each node on the path. Since


    this path has length O (lg n) in an n-node red-black tree, the additional time

    spent maintaining size fields in the first phase is O (lg n). The O (1) rotations

    in the second phase of deletion can be handled in the same manner as for

    insertion. Thus, both insertion and deletion, including the maintenance of the

size fields, take O (lg n) time for an n-node order-statistic tree.

    3.13 How to augment a data structure

    The process of augmenting a basic data structure to support additional

    functionality occurs quite frequently in algorithm design. It will be used again

    in the next section to design a data structure that supports operation on

    intervals. In this section, we shall examine the steps involved in such

    augmentation. We shall also prove a theorem that allows us to augment red-

    black trees easily in many cases.

    Augmenting a data structure can be broken into four steps:

    1. choosing an underlying data structure,

    2. determining additional information to be maintained in the underlying

    data structure,

    3. verifying that the additional information can be maintained for the basic

    modifying operations on the underlying data structure, and

    4. developing new operations.

    As with any prescriptive design method, you should not blindly follow the

    steps in the order given. Most design work contains an element of trial and

error, and progress on all steps usually proceeds in parallel. There is no

    point, for example, in determining additional information and developing new

    operations (steps 2 and 4) if we will not be able to maintain the additional

    information efficiently. Nevertheless, this four-step method provides a good

    focus for your efforts in augmenting a data structure, and it is also a good

    way to organize the documentation of an augmented data structure.


    We followed these steps to design our order-statistic trees. For step 1, we

    chose red-black trees as the underlying data structure. A clue to the

    suitability of red-black trees comes from their efficient support of other

    dynamic-set operations on a total order, such as MINIMUM, MAXIMUM,

    SUCCESSOR, and PREDECESSOR.

    For step 2, we provided the size field, in which each node x stores the size

    of the subtree rooted at x. Generally, the additional information makes

    operations more efficient. For example, we could have implemented

OS-SELECT and OS-RANK using just the keys stored in

    the tree, but they would not have run in O(lg n) time. Sometimes, the

    additional information is pointer information rather than data.

    For step 3, we ensured that insertion and deletion could maintain the size

    fields while still running in O(lg n) time. Ideally, only a few elements of the

    data structure should need to be updated in order to maintain the additional

    information. For example, if we simply stored in each node its rank in the

    tree, the OS SELECT and OS RANK procedures would run quickly, but

    inserting a new minimum element would cause a change to this information

    in every node of the tree. When we store subtree sizes instead, inserting a

    new element causes information to change in only O (lg n) nodes.

    For step 4, we developed the operations OS SELECT and OS RANK.

    After all, the need for new operations is why we bother to augment a data

    structure in the first place. Occasionally, rather than developing new

    operations, we use the additional information to expedite existing ones.

    3.14 Divide-and-Conquer

    The general plan for Divide-and-Conquer technique has the following three

    major steps:


    Step 1: An instance of the problem to be solved, is divided into a number of

    smaller instances of the (same) problem, generally of equal sizes. Any sub-

instance may be further divided into its sub-instances. A stage is reached

    when either a direct solution of a sub-instance at some stage is available or

    it is not further sub-divisible. In the latter case, when no further sub-division

    is possible, we attempt a direct solution for the sub-instance.

    Step 2: Such smaller instances are solved.

    Step 3: Combine the solutions so obtained of the smaller instances to get

    the solution of the original instance of the problem.

    In this unit, we will study the technique, by applying it in solving a number of

    problems.

    3.14.1 General Issues in Divide-and-Conquer

    Recalling from the introduction, Divide-and-Conquer is a technique of

    designing algorithms that proceeds as follows:

Given an instance of the problem to be solved, split it into more than one sub-instance (of the given problem). If possible, divide each of the sub-

    instances into smaller instances, till a sub-instance has a direct solution

    available or no further subdivision is possible. Then independently solve

    each of the sub-instances and then combine solutions of the sub-instances

    so as to yield a solution for the original instance.

    The methods by which sub-instances are to be independently solved play

    an important role in the overall efficiency of the algorithm.

    Example

    We have an algorithm, alpha say, which is known to solve all instances of

size n, of a given problem, in at most c·n² steps (where c is some constant).

    We then discover an algorithm, beta say, which solves the same problem

    by:


Dividing an instance into 3 sub-instances of size n/2.
Solving these 3 sub-instances.
Combining the three sub-solutions, taking d·n steps for the combining.

    Suppose our original algorithm alpha is used to carry out the step 2, viz.,

    solve these sub-instances. Let

    T (alpha) (n) = Running time of alpha on an instance of size n.

    T (beta) (n) = Running time of beta on an instance of size n.

    Then,

T (alpha) (n) = c·n² (by definition of alpha)

But

T (beta) (n) = 3 T (alpha) (n/2) + d·n
= (3/4) c·n² + d·n

So if (3/4) c·n² + d·n ≤ c·n², i.e., if n ≥ 4d/c, then beta is faster than alpha.

In particular, for all large enough n (viz., for n > 4d/c = Constant), beta is

    faster than alpha.

    The algorithm beta improves upon the algorithm alpha by just a constant

    factor. But if the problem size n is large enough such that for some i > 1, we

have n > 4d/c, and also n/2 > 4d/c, and even


n/2^i > 4d/c, which suggests that using beta instead of alpha for step 2 repeatedly, until the sub-sub-...-sub-instances are of size n0 ≤ 4d/c, will yield a still faster algorithm.

    So consider the following new algorithm for instances of size n.

    3.14.2 Procedure gamma (n: problem size),

If n ≤ n0 then
Solve problem using Algorithm alpha;
else
Split the problem instance into 3 sub-instances of size n/2;

    Use gamma to solve each sub-instance;

    Combine the 3 sub-solutions;

    end if;

    end gamma;

    Let T (gamma) (n) denote the running time of this algorithm. Then

T (gamma) (n) = c·n², if n ≤ n0
= 3 T (gamma) (n/2) + d·n, otherwise.

We shall show how relations of this form can be estimated. Later in the course, with these methods it can be shown that

T (gamma) (n) = O(n^(log 3)) = O(n^1.59).

This is a significant improvement upon algorithms alpha and beta, in view of the fact that as n becomes larger, the difference between the values of n^1.59 and n² becomes larger and larger.


The improvement that results from applying algorithm gamma is due to the

    fact that it maximizes the savings achieved through beta. The (relatively)

    inefficient method alpha is applied only to small problem sizes.

    The precise form of a divide-and-conquer algorithm is characterized by:

    i) The threshold input size, n0, below which the problem size is not further

    sub-divided.

    ii) The size of sub-instances into which an instance is split.

    iii) The number of such sub-instances.

    iv) The method for solving instances of size n n0.

    v) The algorithm used to combine sub-solutions.

    In (ii), it is more usual to consider the ratio of initial problem size to sub-

    instance size. In our example, the ratio was 2. The threshold in (i) is

    sometimes called the (recursive) base value. In summary, the generic form

    of a divide-and-conquer algorithm is:

    Procedure D-and-C (n : input size);

    begin

read (n0); (read the threshold value)
if n ≤ n0 then
Solve problem without further sub-division;
else
Split into r sub-instances, each of size n/k;
for each of the r sub-instances do
D-and-C (n/k);
Combine the resulting sub-solutions to produce the solution to the original problem;
end if;
end D-and-C.


    Such algorithms are naturally and easily realized as recursive procedures in

    (suitable) high-level programming languages.

    3.15 Integer Multiplication

    The following problem is a classic example of the application of Divide-and-

    Conquer technique in the field of Computer Science.

    Input: Two n-digit decimal numbers x and y represented as

x = xn−1 xn−2 ... x1 x0 and
y = yn−1 yn−2 ... y1 y0,
where xi, yi are decimal digits.
Output: The (2n)-digit decimal representation of the product x * y:
z = z2n−1 z2n−2 z2n−3 ... z1 z0

    Note: The algorithm given below works for any number base, e.g., binary,

    decimal, hexadecimal, etc. We use decimal simply for convenience.

The classical algorithm for multiplication requires O (n²) steps to multiply two

    n-digit numbers.

    A step is regarded as a single operation involving two single digit numbers,

    e.g., 5+6, 3 * 4, etc.

    In 1962, A. A. Karatsuba discovered an asymptotically faster algorithm for

    multiplying two numbers by using a divide-and-conquer approach.

    The values x and y of the numbers with representations.

x = xn−1 xn−2 ... x1 x0 and
y = yn−1 yn−2 ... y1 y0

    are clearly given by,

x = Σ (i = 0 to n−1) xi · 10^i ; and


y = Σ (i = 0 to n−1) yi · 10^i .

    Then, the resultant value z = x * y

    with representation

z = z2n−1 z2n−2 z2n−3 ... z1 z0

    is given by

z = Σ (i = 0 to 2n−1) zi · 10^i = ( Σ (i = 0 to n−1) xi · 10^i ) · ( Σ (i = 0 to n−1) yi · 10^i )

    For example:

581 = 5 * 10^2 + 8 * 10^1 + 1 * 10^0
602 = 6 * 10^2 + 0 * 10^1 + 2 * 10^0
581 * 602 = 349762 = 3 * 10^5 + 4 * 10^4 + 9 * 10^3 + 7 * 10^2 + 6 * 10^1 + 2 * 10^0

Let us denote

a = x_{n-1} x_{n-2} ......... x_{⌊n/2⌋}

b = x_{⌊n/2⌋-1} ......... x_1 x_0

c = y_{n-1} y_{n-2} ......... y_{⌊n/2⌋}

d = y_{⌊n/2⌋-1} ......... y_1 y_0

where ⌊n/2⌋ = the largest integer less than or equal to n/2.

Then, if a, b, c and d are the numbers whose decimal representations are

a, b, c and d, then

x = a · 10^⌊n/2⌋ + b ;  y = c · 10^⌊n/2⌋ + d


For example, if

n = 4, x = 1026 and y = 7329, then a = 10, b = 26, c = 73 and d = 29,

and,

x = 1026 = 10 * 10^2 + 26 = a * 10^2 + b

y = 7329 = 73 * 10^2 + 29 = c * 10^2 + d

From this we also know that the result of multiplying x and y (i.e., z) is

z = x * y = (a · 10^⌊n/2⌋ + b) · (c · 10^⌊n/2⌋ + d)

          = (a · c) · 10^(2⌊n/2⌋) + (a · d + b · c) · 10^⌊n/2⌋ + (b · d)

where

2⌊n/2⌋ = n, if n is even

       = n – 1, if n is odd

e.g., 1026 * 7329 is

= (10 * 73) * 10^4 + (10 * 29 + 26 * 73) * 10^2 + (26 * 29)

= 730 * 10^4 + 2188 * 10^2 + 754 = 7,519,554

Each of the terms (a * c), (a * d), (b * c) and (b * d) is a product of two ⌊n/2⌋-digit

numbers.

    Thus the expression for the multiplication of x and y in terms of the numbers

    a, b, c and d tells us that:

    1. Two single digit numbers can be multiplied immediately.

    (Recursive base: step 1)

2. If n > 1 then the product of two n-digit numbers can be expressed in

terms of 4 products of two ⌊n/2⌋-digit numbers* (Divide-and-Conquer stage)

* For a given n-digit number, whenever we divide the sequence of

digits into two subsequences, one of which has ⌊n/2⌋ digits, the other

subsequence has n – ⌊n/2⌋ digits, which is ⌊n/2⌋ digits if n is even and

(n+1)/2 digits if n is odd. However, for convenience, we may call both of

them ⌊n/2⌋-digit sequences.

3. Given the four returned products, the calculation of the result of

multiplying x and y involves only additions (which can be done in O(n) steps)

and multiplications by a power of 10 (which can also be done in O(n) steps,

since it only requires placing the appropriate number of 0s at the end of

the number). (Combine stage).

Steps 1–3, therefore, describe a Divide-and-Conquer algorithm for

multiplying two n-digit numbers represented in decimal. However, Karatsuba

discovered how the product of two n-digit numbers could be expressed in

terms of three products, each of two ⌊n/2⌋-digit numbers, instead of the

four products that a conventional implementation of the Divide-and-Conquer

schema, as above, uses.

This saving is accomplished at the expense of slightly more steps in the

combine stage (Step 3), although the combine stage still uses only

O(n) operations.

    We continue with the earlier notations in which z is the product of two

    numbers x and y having respectively the decimal representations

x = x_{n-1} x_{n-2} ... x_1 x_0

y = y_{n-1} y_{n-2} ... y_1 y_0


    Further a, b, c, d are the numbers whose decimal representations are given by

a = x_{n-1} x_{n-2} ......... x_{⌊n/2⌋}

b = x_{⌊n/2⌋-1} ......... x_1 x_0

c = y_{n-1} y_{n-2} ......... y_{⌊n/2⌋}

d = y_{⌊n/2⌋-1} ......... y_1 y_0

Let us compute the following 3 products (of two ⌊n/2⌋-digit numbers):

    U = a * c

    V = b * d

    W = (a+b) * (c +d)

    Then

    W = a * c + a * d + b * c + b * d

    = U + a * d + b * c + V

    Therefore,

a * d + b * c = W – U – V.

Therefore,

z = x * y = (a · 10^⌊n/2⌋ + b) · (c · 10^⌊n/2⌋ + d)

          = (a · c) · 10^(2⌊n/2⌋) + (a · d + b · c) · 10^⌊n/2⌋ + (b · d)

          = U · 10^(2⌊n/2⌋) + (W – U – V) · 10^⌊n/2⌋ + V

    This algorithm is formalized through the following function.


Function Karatsuba (x, y : n-digit integer; n : integer)

    a, b, c, d : ⌈n/2⌉-digit integer;

    U, V, W : n-digit integer;

begin

    if n = 1 then

        return x_0 * y_0;

    else

        a := x_{n-1} .......... x_{⌊n/2⌋} ;

        b := x_{⌊n/2⌋-1} .......... x_0 ;

        c := y_{n-1} .......... y_{⌊n/2⌋} ;

        d := y_{⌊n/2⌋-1} .......... y_0 ;

        U := Karatsuba (a, c, ⌈n/2⌉) ;

        V := Karatsuba (b, d, ⌈n/2⌉) ;

        W := Karatsuba (a + b, c + d, ⌈n/2⌉) ;

        return U * 10^(2⌊n/2⌋) + (W – U – V) * 10^⌊n/2⌋ + V ;

        { where 10^m stands for 10 raised to the power m }

    end if ;

end Karatsuba ;
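The following C++ sketch (an illustration, not part of the original text) follows the same scheme for values small enough to fit in a long long; the digit strings a, b, c and d of the function above become the quotient and remainder on division by 10^⌊n/2⌋. The function name karatsuba and the splitting details are assumptions made for this example.

#include <algorithm>   // for std::max
#include <iostream>
#include <string>      // for std::to_string

long long karatsuba(long long x, long long y)
{
    // Recursive base: single-digit operands are multiplied directly.
    if (x < 10 || y < 10)
        return x * y;

    // Number of decimal digits of the larger operand, and the split point.
    int n = (int) std::max(std::to_string(x).size(), std::to_string(y).size());
    int m = n / 2;                          // corresponds to floor(n/2)
    long long p = 1;
    for (int i = 0; i < m; ++i)
        p *= 10;                            // p = 10^m

    long long a = x / p, b = x % p;         // x = a * 10^m + b
    long long c = y / p, d = y % p;         // y = c * 10^m + d

    long long U = karatsuba(a, c);          // a * c
    long long V = karatsuba(b, d);          // b * d
    long long W = karatsuba(a + b, c + d);  // (a + b) * (c + d)

    // z = U * 10^(2m) + (W - U - V) * 10^m + V
    return U * p * p + (W - U - V) * p + V;
}

int main()
{
    std::cout << karatsuba(1026, 7329) << std::endl;   // prints 7519554
    return 0;
}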


    3.16 Binary Search

    Binary Search algorithm searches a given value or element in an already

    sorted array by repeatedly dividing the search interval in half. It first

    compares the value to be searched with the item in the middle of the array.

    If there is a match, then search is successful and we can return the

    required result immediately. But if the value does not match with the item in

    middle of the array, then it finds whether the given value is less than or

    greater than the value in the middle of array. If the given value is less than

    the middle value, then the value of the item sought must lie in the lower half

    of the array. However, if the given value is greater than the item sought,

    must lie in the upper half of the array. So we repeat the procedure on the

    lower or upper half of the array according as the given value is respectively

    less than or greater than the middle value. The following C++ function gives

    the Binary Search Algorithm.

int binarySearch (int *A, int low, int high, int value)

{
    int mid;

    while (low <= high)
    {
        mid = (low + high) / 2;

        if (value == A[mid])
            return mid;             // match found

        else if (value < A[mid])
            high = mid - 1;         // search the lower half

        else
            low = mid + 1;          // search the upper half
    }

    return -1;                      // value not found
}
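For instance, a small illustrative driver for the function above might look as follows (the array must already be sorted, and –1 signals an unsuccessful search):

#include <iostream>

int main()
{
    int A[] = {2, 5, 8, 12, 16, 23, 38, 56, 72, 91};        // already sorted
    std::cout << binarySearch(A, 0, 9, 23) << std::endl;    // prints 5
    std::cout << binarySearch(A, 0, 9, 40) << std::endl;    // prints -1
    return 0;
}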

    3.16.1 Explanation of the Binary Search Algorithm

    It takes as parameter the array A, in which the value is to be searched. It

    also takes the lower and upper bounds of the array as parameters viz., low


and high respectively. At each iteration of the while loop, the

algorithm reduces the number of elements of the array to be searched

by half. If the value is found, then its index is returned. However, if the value

is not found, the loop terminates when the value of low

exceeds the value of high; there are then no more items to be searched, and

hence the function returns a negative value (–1) to indicate that the item is not

found.

    3.16.2 Analysis

    As mentioned earlier, each step of the algorithm divides the block of items

    being searched in half. The presence or absence of an item in an array of n

elements can be established in at most lg n steps.

    Thus the running time of a binary search is proportional to lg n and we say

this is an O(lg n) algorithm.

    3.17 Sorting

    We have already discussed the two sorting algorithms viz., Merge Sort and

    Quick Sort.

The purpose of repeating the algorithms is mainly to discuss not the design

but the analysis part.

    3.17.1 Merge Sort

    Merge Sort is a sorting algorithm which is based on the divide-and-conquer

    technique or paradigm. The Merge-Sort can be described in general terms

    as consisting of 3 major steps namely, a divide step, recursive step and

    merge step. These steps are described below in more detail.

    Divide Step: If given array A has zero or 1 element then return the array A,

as it is trivially sorted. Otherwise, chop the given array A at about the

middle to give two subarrays A1 and A2, each of which contains about half

of the elements in A.


    Recursive Step: Recursively sort array A1 and A2.

Merge Step: Through recursive application we reach a stage when

subarrays of size 2, and then of size 1, are obtained. At this stage two

sublists of size 1 each are combined by placing the elements of the lists in

sorted order. This type of combination of sublists is repeated

on lists of larger and larger sizes. To accomplish this step we will define a

C++ function void merge (int A[ ], int p, int q, int r).

    The recursion stops when the subarray has just only 1 element, so that it is

    trivially sorted. Below is the Merge Sort function in C++.

void merge_sort (int A[ ], int p, int r)

{
    if (p < r)                     // Check for base case
    {
        int q = (p + r) / 2;       // Divide step

        merge_sort (A, p, q);      // Conquer step

        merge_sort (A, q + 1, r);  // Conquer step

        merge (A, p, q, r);        // Combine step
    }
}

    Next, we define merge function which is called by the program merge-sort.

At this stage, we have an array A and indices p, q, r such that p ≤ q < r.


Subarray A[p..q] is sorted and subarray A[q+1..r] is sorted, and by the

restrictions on p, q, r, neither subarray is empty. We want the two

subarrays to be merged into a single sorted subarray in A[p..r]. We will

implement it so that it takes O(n) time, where n = r – p + 1 is the number of

elements being merged.

    Let us consider two piles of cards. Each pile is sorted and placed face-up on

    a table with the smallest card on top of each pile. We will merge these into a

    single sorted pile, face-down on the table. A basic step will be to choose the

    smaller of the two top cards, remove it from its pile, thereby exposing a new

    top card and then placing the chosen card face-down onto the output pile.

    We will repeatedly perform these basic steps until one input becomes

    empty. Once one input pile empties, just take the remaining input pile and

    place it face-down onto the output pile. Each basic step should take

    constant time, since we check just the two top cards and there are n basic

    steps, since each basic step removes one card from the input piles, and we

    started with n cards in the input piles. Therefore, this procedure should take

O (n) time. We don't actually need to check whether a pile is empty before

    each basic step. Instead we will put on the bottom of each input pile a

    special sentinel card. It contains a special value that we use to simplify the

code. We know in advance that there are exactly r – p + 1 basic steps. Below is

    the function merge which runs in O (n) time.

#include <climits>   // for INT_MIN and INT_MAX

void merge (int A[ ], int p, int q, int r)

{
    int n1 = q - p + 1;              // length of A[p..q]

    int n2 = r - q;                  // length of A[q+1..r]

    int *L = new int [n1 + 2];       // L[0..n1+1]

    int *R = new int [n2 + 2];       // R[0..n2+1]

    for (int i = 1; i <= n1; i++)
        L[i] = A[p + i - 1];

    for (int j = 1; j <= n2; j++)
        R[j] = A[q + j];

    L[0] = INT_MIN;                  // negative infinity

    R[0] = INT_MIN;                  // negative infinity

    L[n1 + 1] = INT_MAX;             // positive infinity (sentinel)

    R[n2 + 1] = INT_MAX;             // positive infinity (sentinel)

    int i = 1, j = 1;

    for (int k = p; k <= r; k++)
    {
        if (L[i] <= R[j])
        {
            A[k] = L[i];
            i = i + 1;
        }
        else
        {
            A[k] = R[j];
            j = j + 1;
        }
    }

    delete [] L;
    delete [] R;
}
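A small driver (an illustration, not part of the original listing) that sorts an 8-element array with the two functions above could look like this:

#include <iostream>

int main()
{
    int A[] = {5, 2, 4, 7, 1, 3, 2, 6};
    merge_sort(A, 0, 7);                  // sort A[0..7]
    for (int k = 0; k <= 7; k++)
        std::cout << A[k] << " ";         // prints 1 2 2 3 4 5 6 7
    std::cout << std::endl;
    return 0;
}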

    3.18 Quick Sort

    The purpose of discussing the Quick Sort Algorithm is to discuss its analysis.

    In the previous section, we discussed how the divide-and-conquer technique

    can be used to design sorting algorithm Merge-sort, as summarized below:

Partition the n-element array A into two subarrays of n/2 elements each

Sort the two subarrays recursively

Merge the two subarrays

Running time: T(n) = 2T(n/2) + Θ(n), which gives T(n) = Θ(n log n).


    The Quick-Sort algorithm is obtained by exploring another possibility of

    dividing the elements such that there is no need of merging, that is

Partition A[1..n] into subarrays A' = A[1..q] and A'' = A[q+1..n]

such that all elements in A'' are larger than all elements in A'.

Recursively sort A' and A''.

(Nothing to combine/merge: A is already sorted after sorting A' and A''.)

    Pseudo code for QUICKSORT:

    The algorithm PARTITION, which is called by QUICKSORT, is defined after

    a short while.

    Then, in order to sort an array A of n elements, we call QUICKSORT with

the three parameters A, 1 and n: QUICKSORT (A, 1, n).

If q = n/2 and partitioning takes Θ(n) time, we again get the same recurrence. If T(n) denotes the

time taken by QUICKSORT in sorting an array of n elements, then

T(n) = 2T(n/2) + Θ(n).

After solving the recurrence we get the running

time as T(n) = Θ(n log n).

The problem is that it is hard to develop a partition algorithm which always

divides A into two halves.

    QUICKSORT (A, p, r)

    If p < r THEN

    q = PARTITION (A, p, r)

QUICKSORT (A, p, q – 1)

    QUICKSORT (A, q + 1, r)

    end if


PARTITION (A, p, r)

    x = A[r]

    i = p – 1

    FOR j = p TO r – 1 DO

        IF A[j] ≤ x THEN

            i = i + 1

            Exchange A[i] and A[j]

        END IF

    END DO

    Exchange A[i + 1] and A[r]

    RETURN i + 1

    QUICKSORT correctness:

    Easy to show inductively, if PARTITION works correctly

    Example:

    2 8 7 1 3 5 6 4 i = 0, j = 1

    2 8 7 1 3 5 6 4 i = 1, j = 2

    2 8 7 1 3 5 6 4 i = 1, j = 3

    2 8 7 1 3 5 6 4 i = 1, j = 4

    2 1 7 8 3 5 6 4 i = 2, j = 5

    2 1 3 8 7 5 6 4 i = 3, j = 6

    2 1 3 8 7 5 6 4 i = 3, j = 7

    2 1 3 8 7 5 6 4 i = 3, j = 8

    2 1 3 4 7 5 6 8 q = 4
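A direct C++ rendering of the two routines above (a sketch given here for illustration; the text itself gives only the pseudo code) is:

#include <algorithm>   // for std::swap

int partition(int A[], int p, int r)
{
    int x = A[r];                     // pivot
    int i = p - 1;
    for (int j = p; j <= r - 1; j++)
        if (A[j] <= x)
        {
            i = i + 1;
            std::swap(A[i], A[j]);
        }
    std::swap(A[i + 1], A[r]);        // place the pivot in its final position
    return i + 1;
}

void quicksort(int A[], int p, int r)
{
    if (p < r)
    {
        int q = partition(A, p, r);
        quicksort(A, p, q - 1);
        quicksort(A, q + 1, r);
    }
}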


    3.18.1 Average running time

    The natural question: what is the average case running time of

    QUICKSORT?

Is it close to the worst case Θ(n^2), or to the best case Θ(n log n)? The average time depends on the distribution of inputs over which we take the average.

    If we run QUICKSORT on a set of inputs that are already sorted, the

    average running time will be close to the worst-case.

    Similarly, if we run QUICKSORT on a set of inputs that give good splits,

    the average running time will be close to the best-case.

    If we run QUICKSORT on a set of inputs which are picked uniformly at

    random from the space of all possible input permutations, then the

    average case will also be close to the best-case. Why? Intuitively, if any

    input ordering is equally likely, then we expect at least as many good

    splits as bad splits, therefore on the average a bad split will be followed

    by a good split, and it gets absorbed in the good split.

So, under the assumption that all input permutations are equally likely, the

average time of QUICKSORT is Θ(n log n) (intuitively). Is this assumption

realistic?

    Not really. In many cases the input is almost sorted: think of rebuilding

    indexes in a database etc.

    The question is: how can we make QUICKSORT have a good average time

    irrespective of the input distribution?

    Using randomization.

3.18.2 Randomized Quick Sort

    Next, we consider what we call randomized algorithms, that is, algorithms

    that make some random choices during their execution.


    Running time of normal deterministic algorithm only depends on the input.

    Running time of randomized algorithm depends not only on input but also on

    the random choices made by the algorithm.

    Running time of a randomized algorithm is not fixed for a given input!

    Randomized algorithms have best-case and worst-case running times, but

    the inputs for which these are achieved are not known, they can be any of

    the inputs.

    We are normally interested in analyzing the expected running time of a

    randomized algorithm, that is the expected (average) running time for all

inputs of size n:

T(n) = E[T(X)], where the expectation is taken over the random choices made by the algorithm.
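As an aside (not part of the original text), an expected running time of this kind can also be estimated empirically by averaging the measured time over many random inputs of the same size; the sketch below does this for std::sort merely to illustrate the idea, and the sizes and trial counts are arbitrary.

#include <algorithm>
#include <chrono>
#include <iostream>
#include <random>
#include <vector>

int main()
{
    const int n = 100000, trials = 20;
    std::mt19937 gen(42);                      // fixed seed for repeatability
    double total_ms = 0.0;
    for (int t = 0; t < trials; ++t)
    {
        std::vector<int> a(n);
        for (int &x : a)
            x = static_cast<int>(gen() % 1000000);   // random input of size n
        auto start = std::chrono::steady_clock::now();
        std::sort(a.begin(), a.end());
        auto stop = std::chrono::steady_clock::now();
        total_ms += std::chrono::duration<double, std::milli>(stop - start).count();
    }
    std::cout << "average time: " << total_ms / trials << " ms\n";
    return 0;
}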

    3.18.3 Randomized Quick sort

    We can enforce that all n! permutations are equally likely by randomly

    permuting the input before the algorithm.

Most computers have a pseudo-random number generator random (1, n)

returning a random number between 1 and n.

    Using pseudo-random number generator we can generate a random

    permutation (such that all n! permutations equally likely) in O (n) time:

Choose the element in A[1] randomly among the elements in A[1..n],

choose the element in A[2] randomly among the elements in A[2..n],

choose the element in A[3] randomly among the elements in A[3..n], and so on.

Alternatively, we can modify PARTITION slightly and exchange the last

element in A with a random element in A before partitioning.


    3.18.4 Expected Running Time of Randomized Quick sort

Let T(n) be the running time of RANDQUICKSORT for an input of size n.

    Running time of RANDQUICKSORT is the total running time spent in all

    PARTITION calls.

PARTITION is called at most n times

    The pivot element r is not included in any recursive calls.

One call of PARTITION takes O(1) time plus time proportional to the

number of iterations of its FOR-loop.

    In each iteration of FOR-loop we compare an element with the pivot

    element.

If X is the number of comparisons "A[j] ≤ x" performed in PARTITION over

the entire execution of RANDQUICKSORT, then the running time is

O(n + X).

E[T(n)] = E[O(n + X)] = O(n + E[X])

RANDPARTITION (A, p, r)

    i = RANDOM (p, r)

    Exchange A[r] and A[i]

    RETURN PARTITION (A, p, r)

RANDQUICKSORT (A, p, r)

    IF p < r THEN

        q = RANDPARTITION (A, p, r)

        RANDQUICKSORT (A, p, q – 1)

        RANDQUICKSORT (A, q + 1, r)

    END IF
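In C++, the same idea can be sketched as follows (assuming the partition function shown earlier in this section; std::rand is used here only as a simple stand-in for RANDOM(p, r)):

#include <algorithm>   // for std::swap
#include <cstdlib>     // for std::rand

int randPartition(int A[], int p, int r)
{
    int i = p + std::rand() % (r - p + 1);   // RANDOM(p, r)
    std::swap(A[r], A[i]);                   // exchange A[r] and A[i]
    return partition(A, p, r);               // ordinary PARTITION
}

void randQuicksort(int A[], int p, int r)
{
    if (p < r)
    {
        int q = randPartition(A, p, r);
        randQuicksort(A, p, q - 1);
        randQuicksort(A, q + 1, r);
    }
}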


    To analyze the expected running time we need to compute E[X]

To compute X we use z1, z2, …, zn to denote the elements in A, where zi

is the ith smallest element. We also use Zij to denote the set {zi, zi+1, …, zj}.

Each pair of elements zi and zj is compared at most once (when either of

them is the pivot):

X = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} X_ij, where

X_ij = 1 if zi is compared to zj, and 0 if zi is not compared to zj.

E[X] = E[ Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} X_ij ]

     = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} E[X_ij]

     = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} Pr[zi compared to zj]

    To compute Pr [zi compared to zj] it is useful to consider when two elements

    are not compared.

Example: Consider an input consisting of the numbers 1 through 10.

Assume the first pivot is 7; the first partition separates the numbers into the sets

{1, 2, 3, 4, 5, 6} and {8, 9, 10}.

    In partitioning 7 is compared to all numbers. No number from the first set will

    ever be compared to a number from the second set.

In general, once a pivot r with zi < r < zj is chosen, we know that zi and zj cannot

later be compared.


On the other hand, if zi is chosen as pivot before any other element in Zij, then

it is compared to each element in Zij. Similarly for zj.

In the example, 7 and 9 are compared because 7 is the first item from Z7,9 to be

chosen as pivot, and 2 and 9 are not compared because the first pivot in Z2,9

is 7.

Prior to an element in Zij being chosen as pivot, the whole set Zij is together in the

same partition, so any element in Zij is equally likely to be the first element

chosen as pivot, and the probability that zi or zj is chosen first in Zij is 2/(j – i + 1). Therefore,

Pr[zi compared to zj] = 2 / (j – i + 1)

We now have:

E[X] = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} Pr[zi compared to zj]

     = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} 2 / (j – i + 1)

     = Σ_{i=1}^{n-1} Σ_{k=1}^{n-i} 2 / (k + 1)

     < Σ_{i=1}^{n-1} Σ_{k=1}^{n-i} 2 / k

     = Σ_{i=1}^{n-1} O(log n)

     = O(n log n)

Since the best case is Ω(n lg n) and E[X] = O(n log n), we have

E[T(n)] = Θ(n lg n). Next we will see how to make quick sort run in

worst-case O(n log n) time.


    3.19 Matrix Multiplication

In a number of problems, matrix multiplications form the core of the algorithms used to

solve the problems under consideration. Any saving in time complexity may

lead to a significant improvement in the overall algorithm. The conventional

algorithm makes O(n^3) integer multiplications. Next, we discuss Strassen's

Algorithm, which makes only O(n^2.8) integer multiplications for multiplying two

n × n matrices.

3.19.1 Strassen's Algorithm

Strassen's recursive algorithm for multiplying n × n matrices runs in

O(n^lg 7) = O(n^2.81) time. For sufficiently large values of n, therefore, it

outperforms the Θ(n^3) matrix-multiplication algorithm.

The idea behind Strassen's algorithm is to multiply 2 × 2 matrices with

only 7 scalar multiplications (instead of 8). Consider the matrices

    ( r  s )   ( a  b )   ( e  g )
    ( t  u ) = ( c  d ) . ( f  h )

The seven submatrix products used are

P1 = a . (g – h)

P2 = (a + b) . h

P3 = (c + d) . e

P4 = d . (f – e)

P5 = (a + d) . (e + h)

P6 = (b – d) . (f + h)

P7 = (a – c) . (e + g)


Using these submatrix products, the entries of the product matrix are obtained by

r = P5 + P4 – P2 + P6

s = P1 + P2

t = P3 + P4

u = P5 + P1 – P3 – P7

This method works, as can easily be seen for s, for example: s = (ag – ah) + (ah + bh) = ag + bh.

In this method there are 7 multiplications and 18 additions. For n × n

matrices, it can be worth replacing one multiplication by 18 additions, since

multiplication costs are much higher than addition costs.
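These identities can be checked numerically for 2 × 2 matrices; the short program below (an illustration, not part of the original text, with arbitrarily chosen entries) recomputes r, s, t and u from the seven products and compares them with the entries of the ordinary product, using the layout [r s; t u] = [a b; c d] * [e g; f h] given above.

#include <iostream>

int main()
{
    int a = 1, b = 2, c = 3, d = 4;        // left matrix  [a b; c d]
    int e = 5, f = 6, g = 7, h = 8;        // right matrix [e g; f h]

    int P1 = a * (g - h);
    int P2 = (a + b) * h;
    int P3 = (c + d) * e;
    int P4 = d * (f - e);
    int P5 = (a + d) * (e + h);
    int P6 = (b - d) * (f + h);
    int P7 = (a - c) * (e + g);

    int r = P5 + P4 - P2 + P6;             // should equal a*e + b*f
    int s = P1 + P2;                       // should equal a*g + b*h
    int t = P3 + P4;                       // should equal c*e + d*f
    int u = P5 + P1 - P3 - P7;             // should equal c*g + d*h

    std::cout << (r == a*e + b*f) << (s == a*g + b*h)
              << (t == c*e + d*f) << (u == c*g + d*h) << std::endl;  // prints 1111
    return 0;
}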

The recursive algorithm for multiplying n × n matrices is given below:

1. Divide: Partition each of the two matrices A, B into four n/2 × n/2 submatrices.

2. Conquer: Perform the 7 multiplications of n/2 × n/2 matrices recursively.

3. Combine: Form the product matrix from the results using + and –.

The running time of the above algorithm is given by the recurrence

T(n) = 7T(n/2) + Θ(n^2)

     = Θ(n^lg 7)

     = O(n^2.81)

The current best upper bound for multiplying matrices is approximately O(n^2.376).

3.19.2 Limitations of Strassen's Algorithm

From a practical point of view, Strassen's algorithm is often not the method

of choice for matrix multiplication, for the following four reasons:


1. The constant factor hidden in the running time of Strassen's algorithm is

larger than the constant factor in the Θ(n^3) method.

2. When the matrices are sparse, methods tailored for sparse matrices are

faster.

3. Strassen's algorithm is not quite as numerically stable as the naive method.

4. The submatrices formed at the levels of recursion consume space.

    3.20 Splay tree

    A splay tree is a self-balancing binary search tree with the additional

    unusual property that recently accessed elements are quick to access

    again. It performs basic operations such as insertion, look-up and removal in

O(log n) amortized time. For many non-uniform sequences of operations,

splay trees perform better than other search trees, even when the specific

pattern of the sequence is unknown. The splay tree was invented by Daniel

    Sleator and Robert Tarjan.

    All normal operations on a binary search tree are combined with one basic

    operation, called splaying. Splaying the tree for a certain element

    rearranges the tree so that the element is placed at the root of the tree. One

    way to do this is to first perform a standard binary tree search for the

    element in question, and then use tree rotations in a specific fashion to bring

    the element to the top. Alternatively, a bottom-up algorithm can combine the

    search and the tree reorganization.

    3.20.1 Advantages and disadvantages

    Good performance for a splay tree depends on the fact that it is self-

    balancing, and indeed self optimizing, in that frequently accessed nodes will

    move nearer to the root where they can be accessed more quickly. This is

    an advantage for nearly all practical applications, and is particularly useful

    for implementing caches; however it is important to note that for uniform


access, a splay tree's performance will be considerably (although not

    asymptotically) worse than a somewhat balanced simple binary search tree.

    Splay trees also have the advantage of being considerably simpler to

    implement than other self-balancing binary search trees, such as red-black

    trees or AVL trees, while their average-case performance is just as efficient.

Also, splay trees don't need to store any bookkeeping data, thus minimizing

    memory requirements. However, these other data structures provide worst-

    case time guarantees, and can be more efficient in practice for uniform access.

    One worst case issue with the basic splay tree algorithm is that of

    sequentially accessing all the elements of the tree in the sort order. This

    leaves the tree completely unbalanced (this takes n accesses- each an O(1)

    operation). Reaccessing the first item triggers an operation that takes O(n)

    operations to rebalance the tree before returning the first item. This is a

    significant delay for that final operation, although the amortized performance

    over the entire sequence is actually O(1). However, recent research shows

    that randomly rebalancing the tree can avoid this unbalancing effect and

    give similar performance to the other self-balancing algorithms.

    It is possible to create a persistent version of splay trees which allows

    access to both the previous and new versions after an update. This requires

    amortized O(log n) space per update.

    3.20.2 The splay operation

    When a node x is accessed, a splay operation is performed on x to move it

    to the root. To perform a splay operation we carry out a sequence of splay

    steps, each of which moves x closer to the root. As long as x has a

    grandparent, each particular step depends on two factors:

    Whether x is the left or right child of its parent node, p,

    Whether p is the left or right child of its parent, g (the grandparent of x).


    Thus, there are four cases when x has a grandparent. They fall into two

    types of splay steps.

3.20.3 Zig-zag step: One zig-zag case is when x is the right child of p and p

is the left child of g (shown above). p becomes the new left child of x, g becomes the new

right child of x, and the subtrees A, B, C, and D of x, p, and g are rearranged

as necessary. The other zig-zag case is the mirror image of this, i.e. when x

is the left child of p and p is the right child of g. Note that a zig-zag step is

equivalent to doing a rotation on the edge between x and p, then doing a

rotation on the resulting edge between x and g.



Zig-zig step: One zig-zig case is when x is the left child of p and p is the

left child of g (shown above). p becomes the new right child of x, g becomes the new right

child of p, and the subtrees A, B, C, and D of x, p, and g are rearranged as

necessary. The other zig-zig case is the mirror image of this, i.e. when x is

the right child of p and p is the right child of g. Note that zig-zig steps are

the only thing that differentiate splay trees from the rotate-to-root method

introduced by Allen and Munro prior to the introduction of splay trees.

    3.21 Zig Step

    There is also a third kind of splay step that is done when x has a parent p

    but no grandparent. This is called a zig step and is simply a rotation on the

    edge between x and p. Zig steps exist to deal with the parity issue and will

    be done only as the last step in a splay operation and only when x has odd

    depth at the beginning of the operation.

    By performing a splay operation on the node of interest after every access,

    we keep recently accessed nodes near the root and keep the tree roughly

    balanced, so that we achieve the desired amortized time bounds.
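A minimal C++ sketch of the underlying rotations is given below (illustrative only; the node layout and function names are assumptions, and a full splay operation would apply the zig, zig-zig and zig-zag cases repeatedly until x reaches the root):

struct Node
{
    int key;
    Node *left  = nullptr;
    Node *right = nullptr;
};

// Rotate the edge between x and its left child (the "zig" case when the
// accessed node is the left child of the root); assumes x->left != nullptr
// and returns the new root of this subtree.
Node* rotateRight(Node *x)
{
    Node *y = x->left;
    x->left = y->right;    // the middle subtree moves from y to x
    y->right = x;
    return y;
}

// Mirror image: rotate the edge between x and its right child.
Node* rotateLeft(Node *x)
{
    Node *y = x->right;
    x->right = y->left;
    y->left = x;
    return y;
}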


    Self Assessment Questions

    1. Explain briefly stacks and queues.

2. What do you mean by Hash tables?

    3. Explain in your own words the concep