topic 7: binary trees1 · 2012. 12. 10. · topic 7: binary trees1 (version of november 28, 2011)...

52
Topic 7: Binary Trees 1 (Version of November 28, 2011) Pierre Flener Computing Science Division Department of Information Technology Uppsala University Sweden Course 1DL201: Program Construction and Data Structures 1 Based on original slides by Yves Deville, and with pictures by John Morris, Justin Pearson, and from the Wikipedia

Upload: others

Post on 19-Feb-2021

6 views

Category:

Documents


0 download

TRANSCRIPT

  • Topic 7: Binary Trees1(Version of November 28, 2011)

    Pierre Flener

    Computing Science DivisionDepartment of Information Technology

    Uppsala UniversitySweden

    Course 1DL201:Program Construction and Data Structures

    1Based on original slides by Yves Deville, and with pictures by JohnMorris, Justin Pearson, and from the Wikipedia

    http://user.it.uu.se/~pierref/

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Outline

    1 Binary Trees

    2 Binary Search Trees

    3 Balanced Binary Search TreesAVL TreesRed-Black Trees

    4 Conclusion

    Course 1DL201 - 2 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Outline

    1 Binary Trees

    2 Binary Search Trees

    3 Balanced Binary Search TreesAVL TreesRed-Black Trees

    4 Conclusion

    Course 1DL201 - 3 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Binary Trees

    Binary trees of elements of arbitrary type: ’a bTree

    3

    1

    0 2

    7

    5

    4 6

    8

    TerminologyNode, root,leaf (plural: leaves),internal (inner) nodeLeft and right subtreeParent, left and rightchild, siblingEdge, path, branchLevel

    Graphical Representation ConventionEmpty trees are not drawn (but they consume memory).

    Course 1DL201 - 4 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Value and Some Operations

    empty

    TYPE: ’a bTree

    VALUE: the empty binary tree

    isEmpty T

    TYPE: ’’a bTree -> bool

    PRE: (none)

    POST: true if T is empty, and false otherwise

    cons (r, L, R)

    TYPE: ’a * ’a bTree * ’a bTree -> ’a bTree

    PRE: (none)

    POST: the binary tree with root r, left subtree L,

    and right subtree R

    Course 1DL201 - 5 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Some More Operations

    left T

    TYPE: ’a bTree -> ’a bTree

    PRE: T is non-empty

    POST: the left subtree of T

    right T

    TYPE: ’a bTree -> ’a bTree

    PRE: T is non-empty

    POST: the right subtree of T

    root T

    TYPE: ’a bTree -> ’a

    PRE: T is non-empty

    POST: the root of T

    Course 1DL201 - 6 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Representation and Implementation

    datatype ’a bTree = Void

    | Bt of ’a * ’a bTree * ’a bTree

    REPRESENTATION CONVENTION: the empty binary tree is represented by Void;

    a binary tree with root r, left subtree L, and right subtree R

    is represented by Bt(r,L,R)

    REPRESENTATION INVARIANT: (none)

    EX: Bt(3, Bt(1, Bt(0,Void,Void), Bt(2,Void,Void)),

    Bt(7, Bt(5, Bt(4,Void,Void), Bt(6,Void,Void)), Bt(8,Void,Void)))

    where bTree is a type constructorwhile Void and Bt are value constructors.

    val empty = Void

    fun isEmpty T = (T = Void)

    fun cons (r, L, R) = Bt(r,L,R)

    fun left (Bt(r,L,R)) = L

    fun right (Bt(r,L,R)) = R

    fun root (Bt(r,L,R)) = r

    Exercise: All these operations always take Θ(1) time.Course 1DL201 - 7 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Walks

    DefinitionA walk of a data structure is a way of listing each of itselements exactly once. For binary trees, we distinguish:

    the preorder walk (first list the root, thenwalk the left subtree, and last walk the right subtree)the inorder walk (left, root, right)the postorder walk (left, right, root)

    ExampleFor the binary tree on page 4 :

    preorder walk = 3 1 0 2 7 5 4 6 8inorder walk = 0 1 2 3 4 5 6 7 8 (by coincidence?)postorder walk = 0 2 1 4 6 5 8 7 3

    Course 1DL201 - 8 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Inorder Walk

    inorder T

    TYPE: ’a bTree -> ’a list

    PRE: (none)

    POST: inorder walk of T

    VARIANT: |T|

    fun inorder Void = []

    | inorder (Bt(r,L,R)) =

    (inorder L) @ (r :: inorder R)

    Double recursion, but no tail recursion.Program uses

    no

    X@Y , which takes Θ(|X |) time.Exercise: inorder T takes:– Θ(|T|) time at best (when L is always empty).– Θ(|T| · lg |T|) time on average (when always |L| = |R|).– Θ(|T|2) time at worst (when R is always empty).

    Course 1DL201 - 9 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Generalisation by Accumulator Introduction

    inorder’ (T, A)

    TYPE: ’a bTree * ’a list -> ’a list

    PRE: (none)

    POST: (inorder walk of T) @ A

    VARIANT: |T|

    fun inorder’ (Void, A) = A

    | inorder’ (Bt(r,L,R), A) =

    inorder’ (L, r :: inorder’ (R, A))

    fun inorder T = inorder’ (T, [])

    Double recursion, but one tail recursion.Program uses no X@Y , but the specif. of inorder’ does.Exercise: inorder’ (T, A) and thus the new version ofinorder T always take Θ(|T|) time, independently of |A|.Exercise: Implement efficient preorder and postorder walksof a binary tree, and analyse the designed algorithms.

    Course 1DL201 - 10 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Structural Generalisation

    inorders Ts

    TYPE: ’a bTree list -> ’a list

    PRE: (none)

    POST: (inorder walk of T1) @ ... @ (inorder walk of Tm),

    when Ts = [T1,...,Tm]

    VARIANT: a tree is replaced by 0, 1, or 2 smaller trees

    fun inorders [] = []

    | inorders (Void::Ts) = inorders Ts

    | inorders (Bt(r,Void,R)::Ts) = r::(inorders (R::Ts))

    | inorders (Bt(r,L,R)::Ts) =

    inorders (L::Bt(r,Void,R)::Ts)

    fun inorder T = inorders [T]

    Single recursion: it is even a tail recursion in two clauses.Exercise: For binary trees with a total of n nodes, inordersand the new version of inorder always take Θ(n) time.

    Course 1DL201 - 11 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Structural Generalisation: Sample Trace

    inorders [Bt(3,Bt(1,Void,Void),Bt(7,Void,Void))]

    = inorders [Bt(1,Void,Void), Bt(3,Void,Bt(7,Void,Void))]

    by fourth clause= 1 :: inorders [Void, Bt(3,Void,Bt(7,Void,Void))]

    by third clause= 1 :: inorders [Bt(3,Void,Bt(7,Void,Void))]

    by second clause= 1 :: 3 :: inorders [Bt(7,Void,Void)]

    by third clause= 1 :: 3 :: 7 :: inorders [Void]

    by third clause= 1 :: 3 :: 7 :: inorders []

    by second clause= 1 :: 3 :: 7 :: []

    by first clause= [1,3,7]

    by definition

    Course 1DL201 - 12 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Even More Operations

    exists (T, k)

    TYPE: ’’a bTree * ’’a -> bool

    PRE: (none)

    POST: true if T contains node k, and false otherwise

    insert (T, k)

    TYPE: ’a bTree * ’a -> ’a bTree

    PRE: (none)

    POST: T with node k

    delete (T, k)

    TYPE: ’’a bTree * ’’a -> ’’a bTree

    PRE: (none)

    POST: if k exists in T, then T without one occurrence

    of node k, otherwise T

    Course 1DL201 - 13 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Yet More Operations

    nbNodes T

    TYPE: ’a bTree -> int

    PRE: (none)

    POST: the number of nodes of T

    nbLeaves T

    TYPE: ’a bTree -> int

    PRE: (none)

    POST: the number of leaves of T

    ExercisesImplement efficient algorithms for these five functions.Show that these algorithms at worst take Θ(|T|) time,if not Θ(1) time.

    Course 1DL201 - 14 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Height

    DefinitionThe height of a node is the length of the longest path(measured in its number of nodes) from that node to a leaf.The height h(T ) of a tree T is the height of the root of T .

    Example: The binary tree on page 4 has height 4.

    height T

    TYPE: ’a bTree -> int

    PRE: (none)

    POST: h(T)VARIANT: h(T) (* note that |T| is also a variant *)fun height Void = 0

    | height (Bt(r,L,R)) = 1 + Int.max (height L, height R)

    Double recursion, but no tail recursion.Exercise: height T always takes Θ(|T|) time.

    Course 1DL201 - 15 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Generalisation by Accumulator Introduction

    Note that height’ (T, a) with POST: a + h(T)does not suffice to get a tail recursion: Why?!

    height’ (T, a, hMax)

    TYPE: ’a bTree * int * int -> int

    PRE: (none)

    POST: max(a + h(T), hMax)

    VARIANT: h(T)fun height’ (Void, a, hMax) = Int.max (a, hMax)

    | height’ (Bt(r,L,R), a, hMax) =

    height’ (L, a+1, height’ (R, a+1, hMax))

    fun height T = height’ (T, 0, 0)

    Double recursion, but one tail recursion.Exercise: height’(T,a,hMax), and thus the new version ofheight T, also always takes Θ(|T|) time (independently of aand hMax), but less space than the old version of height T.

    Course 1DL201 - 16 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Outline

    1 Binary Trees

    2 Binary Search Trees

    3 Balanced Binary Search TreesAVL TreesRed-Black Trees

    4 Conclusion

    Course 1DL201 - 17 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Binary Search Trees

    Binary search trees of elements with integer keys andvalues of arbitrary type: ’a bsTreeWe specialise binary trees by adding the search invariant:

    REPRESENTATION INVARIANT: for a binary search tree

    with (k,v) in the root, left subtree L, and right subtree R:

    - every element of L has a key smaller than k

    - every element of R has a key larger than k

    and recursively so on, for L and R

    Example: The binary tree on page 4 is a binary search tree(whose values are not depicted).

    Note that we (arbitrarily) ruled out duplicate keys.

    Benefit: The inorder walk of a binary search tree lists itsnodes by increasing order of their keys.

    Question: Do we now have linear-time sorting?!

    Course 1DL201 - 18 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Representation and Implementation

    datatype ’b bsTree = Void

    | Bst of (int * ’b) * ’b bsTree * ’b bsTree

    REPRESENTATION CONVENTION: the empty binary search tree is represented

    by Void; a binary search tree with key-value pair (k,v) in the root,

    left subtree L, and right subtree R is represented by Bst((k,v),L,R)

    REPRESENTATION INVARIANT: (see previous page)

    empty

    TYPE: ’b bsTree

    VALUE: the empty binary search tree

    val empty = Void

    isEmpty T

    TYPE: ’’b bsTree -> bool

    PRE: (none)

    POST: true if T is empty, and false otherwise

    TIME COMPLEXITY: Θ(1) alwaysfun isEmpty T = (T = Void)

    Course 1DL201 - 19 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Existence Check

    exists (T, k)

    TYPE: ’b bsTree * int -> bool

    PRE: (none)

    POST: true if T contains a node with key k,

    and false otherwise

    VARIANT: h(T)TIME COMPLEXITY: Θ(|T|) at worst (when h(T)=|T|)fun exists (Void, k) = false

    | exists (Bst((key,value),L,R), k) =

    if k = key then true

    else if k < key then exists (L, k)

    else exists (R, k)

    Course 1DL201 - 20 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Search

    search (T, k)

    TYPE: ’b bsTree * int -> ’b

    PRE: k exists in T

    POST: the value associated to key k in T

    VARIANT: h(T)TIME COMPLEXITY: Θ(|T|) at worst (when h(T)=|T|)fun search (Bst((key,value),L,R), k) =

    if k = key then value

    else if k < key then search (L, k)

    else search (R, k)

    Course 1DL201 - 21 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Insertion

    Compare with the specification for binary trees on page 13 ,so as to exploit the assumed absence of duplicates inbinary search trees:

    insert (T, k, v)

    TYPE: ’b bsTree * int * ’b -> ’b bsTree

    PRE: (none)

    POST: if k exists in T, then T with v as value for key k,

    else T with node (k,v)

    VARIANT: h(T)TIME COMPLEXITY: Θ(|T|) at worst (when h(T)=|T|)fun insert (Void, k, v) = Bst((k,v),Void,Void)

    | insert (Bst((key,value),L,R), k, v) =

    if k = key then Bst((k,v),L,R)

    else if k < key then Bst((key,value), insert (L, k, v), R)

    else Bst((key,value), L, insert (R, k, v))

    Course 1DL201 - 22 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Deletion Issues

    When deleting a node (key,value) whose subtrees Land R are both non-empty, we must not violate the searchinvariant.

    One option is:1 Replace (key,value) by the node with the maximal

    key of L, as that key is smaller than the key of anynode of R.

    2 Remove this replacement node from L.

    The other option is to replace (key,value) by the nodewith the minimal key of R, as that key is larger than the keyof any node of L, and to remove that replacement nodefrom R.

    Course 1DL201 - 23 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Help Function for Deletion

    So we need a help function:

    extractMax T

    TYPE: ’b bsTree -> (int * ’b) * ’b bsTree

    PRE: T is non-empty

    POST: (max, T’), where max is the node of T with

    the maximal key, and T’ is T without max

    VARIANT: the number of elements larger than the root of T

    TIME COMPLEXITY: Θ(|T|) at worst (when h(T)=|T|)fun extractMax (Bst(r,L,Void)) = (r, L)

    | extractMax (Bst(r,L,R)) =

    let val (max, newR) = extractMax R

    in (max, Bst(r,L,newR)) end

    Note that h(T) would not be a helpful variant, as it does nothelp identify the base case, although it does get strictlysmaller in the recursive call.

    Course 1DL201 - 24 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Deletion

    Compare with the specification for binary trees on page 13 :

    delete (T, k)

    TYPE: ’b bsTree * int -> ’b bsTree

    PRE: (none)

    POST: if k exists in T, then T without the node with key k,

    else T

    ALGORITHM: when both subtrees of node of key k are non-empty, replace

    that node by the node of maximal key in its left subtree

    VARIANT: h(T)TIME COMPLEXITY: Θ(|T|) at worst (when h(T)=|T|)fun delete (Void, k) = Void

    | delete (Bst((key,value),L,R), k) =

    if k < key then Bst((key,value), delete (L, k), R)

    else if k > key then Bst((key,value), L, delete (R, k))

    else (* k = key *)

    case (L,R) of

    (Void,_) => R

    | (_,Void) => L

    | (_, _) => let val (max, newL) = extractMax L

    in Bst(max,newL,R) end

    Course 1DL201 - 25 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Observations

    The search time in a binary search tree depends on theshape of the tree, that is on the order in which its elementswere inserted.The A pathological case: The n elements are inserted byincreasing order on the keys, yielding something like alinear list (but with a worse space complexity), with Θ(n)search time at worst. Consider the tree on the right:

    Course 1DL201 - 26 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    More Observations

    The height h of a binary (search) tree of n elements is suchthat lg n < h ≤ n. Hence search, retrieval, insertion, anddeletion at worst take Θ(n) time, which is poor, so we couldhave decided not even to implement these functions.

    The height of a randomly built (via insertions only, all keysbeing of equal probability) binary search tree of n elementsis Θ(lg n).

    In practice, one can however not always guarantee thatbinary search trees are built randomly. Binary search treesare thus only interesting when they are “relatively complete.”

    So we must look for a further specialisation of binary searchtrees, whose worst-case performance on the basic treeoperations can be guaranteed to be logarithmic at most.

    Course 1DL201 - 27 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Outline

    1 Binary Trees

    2 Binary Search Trees

    3 Balanced Binary Search TreesAVL TreesRed-Black Trees

    4 Conclusion

    Course 1DL201 - 28 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Balanced Binary Search Trees

    “Definition”: A balanced tree is a tree where no leaf is“more than a certain distance” away from the root than anyother leaf.The balancing invariants defining “not more than a certaindistance” differ between various kinds of balanced trees:

    AVL trees (not studied in this course)Red-black trees (focus of this course). . .

    Insertion and deletion involve transforming the tree if itsbalancing invariant is violated.These re-balancing transformations must take Θ(lg n) timeat worst, so that the effort is worth it. These transformationsare built from operators (introduced on the next page) thatare independent of the balancing invariant.

    Course 1DL201 - 29 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Rotations of Binary Trees

    DefinitionA rotation of a binary tree transforms the tree so that itsinorder walk is preserved. (For a binary search tree, thesearch invariant is thus automatically preserved.)We distinguish left rotation and right rotation:

    inorder walk = A x B y Cpreorder walk = y x A B C preorder walk = x A y B C

    postorder walk = A B x C y postorder walk = A B C y x

    Exercise: Implement both rotations to take Θ(1) time.Course 1DL201 - 30 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Outline

    1 Binary Trees

    2 Binary Search Trees

    3 Balanced Binary Search TreesAVL TreesRed-Black Trees

    4 Conclusion

    Course 1DL201 - 31 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    AVL Trees (Adel’son-Velskii & Landis, 1962)

    AVL trees, the first dynamically balanced trees, are notperfectly balanced, but guarantee Θ(lg n) worst-casesearch, insertion, deletion times for trees of initially n nodes.

    DefinitionAn AVL tree is a binary search tree with balancing invariant:if ` and r are the heights of the left and right subtrees,then −1 ≤ r − ` + 1:

    and conversely (when the left and right subtrees areexchanged) or when both subtrees are of height h − 1.

    Course 1DL201 - 32 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Two Examples and One Counter-Example

    Let us annotate each node with a balance factor for the treerooted at the considered node:• if the tree is stable (when r − ` = 0)− if the tree is left-heavy (when r − ` = −1)+ if the tree is right-heavy (when r − ` = +1)−− if the tree is left-unbalanced (when r − ` < −1)++ if the tree is right-unbalanced (when r − ` > +1)where ` and r are the heights of the left and right subtrees.

    Course 1DL201 - 33 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    What Can We Expect from AVL Trees?

    Key QuestionsWhat is the maximum height hMax(n)of an AVL tree with n nodes?What is the minimum number nMin(h) of nodesof an AVL tree of height h?

    Equivalent questions, but the second is easier to answer.

    Recurrence:

    nMin(h) =

    0 if h = 01 if h = 11 + nMin(h − 1) + nMin(h − 2) if h > 1

    Course 1DL201 - 34 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Search

    Compare the nMin series with the Fibonacci series:

    h 0 1 2 3 4 5 6 7 8 . . .nMin(h) 0 1 2 4 7 12 20 33 54 . . .fib(h) 0 1 1 2 3 5 8 13 21 . . .

    Observe: nMin(h) = fib(h + 2)− 1. (Exercise: Prove this.)Equivalently, the maximum height hMax(n) of an AVL treewith n elements is the largest h such that:

    fib(h + 2)− 1 ≤ n

    which simplifies into hMax(n) ≤ 1.44 · lg(n + 1)− 1.33,so that search in an AVL tree takes Θ(lg n) time at worst.Note that the exists and search algorithms for binarysearch trees work unchanged for AVL trees.

    Course 1DL201 - 35 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Insertion

    How to insert — in logarithmic time — an element into anAVL tree such that it remains an AVL tree?

    After locating the insertion place and performing a standardbinary-search-tree insertion, there are only five cases:

    1 Every subtree remains balanced (•, −, or +): done.2 A left-heavy subtree became left-unbalanced (−−):

    a The balanced left subtree became left-heavy (−):right-rotate the left subtree toward the root.

    b The balanced left subtree became right-heavy (+):first left-rotate the right subtree of the left subtreetoward its parent, and then right-rotate the left subtreetoward the root.

    3 A right-heavy subtree became right-unbalanced (++):symmetric Cases 3a and 3b to Cases 2a and 2b above.

    Course 1DL201 - 36 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Insertion: Case 2a

    Right-rotate left-heavy pivot 3 toward left-unbalanced root 5:

    The inserted element is 2 (if C and D are empty) or a leaf of C or D.The trees A and B have height h, while the tree rooted at 2 had height hbefore the insertion and obtained height h + 1 after the insertion.

    Course 1DL201 - 37 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Insertion: Case 2b

    First left-rotate the stable pivot 4 toward the right-heavyroot 3, and then right-rotate the now left-heavy pivot 4toward the still left-unbalanced root 5:

    The inserted element is 4 (if C and D are empty) or a leaf of C or D.The trees A and B have height h, while the tree rooted at 4 had height hbefore the insertion and obtained height h + 1 after the insertion.

    Course 1DL201 - 38 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Insertion and Deletion

    Insertion Property

    An insertion includes re-balancing⇒ (:)

    that insertion does not modify the height of the tree.

    + Hence it suffices to re-balance the tree once,namely at the imbalance that is furthest from the root.

    Insertion: Insertion requires at most two walks of the pathfrom the root to the added element, plus at most twoconstant-time rotations, hence insertion indeed takesΘ(lg n) time at worst on an AVL tree of initially n nodes.

    Deletion: The deletion of a node of given key from an AVLtree of initially n nodes can also be performed in Θ(lg n)time at worst. (The algorithm is not studied in this course.)

    Course 1DL201 - 39 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Outline

    1 Binary Trees

    2 Binary Search Trees

    3 Balanced Binary Search TreesAVL TreesRed-Black Trees

    4 Conclusion

    Course 1DL201 - 40 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Red-Black Trees (Guibas&Sedgewick, 1978)

    Red-black trees are dynamically balanced: they are notperfectly balanced (see Figure 13.1(a) on page 310 ofCLRS3), but they guarantee Θ(lg n) worst-case search,insertion, deletion times for trees of initially n nodes.

    Definition (and Example)A red-black tree is a binary search tree where every node iscoloured either red or black, with the balancing invariants:

    i No red node has a red parent.ii Every path from the root to an

    empty tree contains the samenumber of black nodes.

    11

    10 12

    4

    datatype colour = Red | Black

    datatype rbTree =

    Void | RBT of colour * rbTree * int * rbTreeCourse 1DL201 - 41 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Insertion (algorithm of Okasaki, 1999)

    PropertyA red-black tree of n nodes has height at most 2 · lg(n + 1),because the longest possible path, which is alternatingblack and red, is no more than twice as long as the shortestpossible path, which is all black.

    How to insert, in O(lg n) time at worst, an element into ann-node red-black tree such that it remains a red-black tree?

    1 Perform a standard binary-search-tree insertion.2 Colour the new node red.3 Rebalance the tree, if need be:

    A There is a red node with a red parent and a blackgrandparent. This can happen in four different ways.

    B There is no such red node.

    4 Colour the root black.Course 1DL201 - 42 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Insertion

    fun insert (k, T) =

    let

    fun ins Void = RBT(Red,Void,k,Void)

    | ins (T as RBT(col,L,key,R)) =

    if k < key then

    balance (col, ins L, key, R)

    else if k > key then

    balance (col, L, key, ins R)

    else (* k = key *) T

    val RBT(_,L,key,R) = ins T

    in

    RBT(Black,L,key,R)

    end

    Course 1DL201 - 43 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Insertion: Case A1 of Rebalancing

    Right-rotate z:

    d

    z y

    x z

    a b c dx

    y

    a b

    c

    fun balance(Black,RBT(Red,RBT(Red,a,x,b),y,c),z,d) =

    RBT(Red,RBT(Black,a,x,b),y,RBT(Black,c,z,d))

    | ...

    Course 1DL201 - 44 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Insertion: Case A2 of Rebalancing

    First left-rotate x and then right-rotate z:

    a

    b c

    d

    z

    x

    y

    y

    x z

    a b c d

    ...

    | balance(Black,RBT(Red,a,x,RBT(Red,b,y,c)),z,d) =

    RBT(Red,RBT(Black,a,x,b),y,RBT(Black,c,z,d))

    | ...

    Course 1DL201 - 45 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Insertion: Case A3 of Rebalancing

    First right-rotate z and then left-rotate x :y

    x z

    a b c d

    a

    x

    y

    z

    d

    b c

    ...

    | balance(Black,a,x,RBT(Red,RBT(Red,b,y,c),z,d)) =

    RBT(Red,RBT(Black,a,x,b),y,RBT(Black,c,z,d))

    | ...

    Course 1DL201 - 46 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Insertion: Case A4 of Rebalancing

    Left-rotate x :y

    x z

    a b c d

    dc

    b

    a

    x

    y

    z

    ...

    | balance(Black,a,x,RBT(Red,b,y,RBT(Red,c,z,d))) =

    RBT(Red,RBT(Black,a,x,b),y,RBT(Black,c,z,d))

    | ...

    Course 1DL201 - 47 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Insertion: Case B of Rebalancing

    In all other situations, do nothing:

    ...

    | balance body = RBT body

    NoteThe resulting insertion and rebalancing algorithm, designedunder a functional programming mindset, is much moreelegant and compact, at no loss of speed, than the classicalalgorithm (in the textbook), which was designed under animperative programming mindset!

    Course 1DL201 - 48 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Insertion: Example

    Insert 4 into the empty tree:

    4 step 3: skipped−−−−−−−−−→ 4 step 4−−−→ 4

    Insert 5 into the resulting tree:

    4

    5 step 3: case B−−−−−−−−→

    4

    5 step 4−−−→

    4

    5

    Insert 6 into the resulting tree:

    4

    5

    6 step 3: B−−−−−→

    4

    5

    6 step 3: A4−−−−−−→left-rotate 4

    5

    4 6 step 4−−−→

    5

    4 6

    Course 1DL201 - 49 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch TreesAVL Trees

    Red-Black Trees

    Conclusion

    Insertion: Example (end)

    Insert 3 into the resulting tree:

    5

    4

    3

    6step 3: case B,step 3: case B−−−−−−−−→

    5

    4

    3

    6

    step 4−−−→

    5

    4

    3

    6

    Insert 2 into the resulting tree:

    5

    4

    3

    2

    6

    step 3: B−−−−−→

    5

    4

    3

    2

    6

    step 3: A1,step 3: B,

    step 4−−−−−→right-rot 4

    5

    3

    2 4

    6

    Course 1DL201 - 50 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    Outline

    1 Binary Trees

    2 Binary Search Trees

    3 Balanced Binary Search TreesAVL TreesRed-Black Trees

    4 Conclusion

    Course 1DL201 - 51 - Program Construction and Data Structures

  • Binary Trees

    BinarySearch Trees

    BalancedBinarySearch Trees

    Conclusion

    When to Use Balanced Search Trees?

    Dynamically balanced binary search trees trees areinteresting when:

    The number n of elements is large (say n ≥ 50),andThe keys are (suspected of) not appearing randomly,andThe ratio of the expected number s of searches to theexpected number i of insertions is large enough (says/i ≥ 5) to justify the costs of dynamic re-balancing.

    The numbers 50 and 5 depend on the computerarchitecture.

    Course 1DL201 - 52 - Program Construction and Data Structures

    Binary TreesBinary Search TreesBalanced Binary Search TreesAVL TreesRed-Black Trees

    Conclusion