fibonacci heaps and binary heaps

Term paper of DATA STRUCTURES CSE:205 TERM PAPER TOPIC :- Compare the practical performance of Fibonacci heaps and binary heaps.

Upload: ashishpatel99

Post on 18-Nov-2014


Page 1: Fibonacci Heaps and Binary Heaps

Term paper of DATA STRUCTURES

CSE:205

TERM PAPER TOPIC:- Compare the practical performance of

Fibonacci heaps and binary heaps.

SUBMITTED TO: Mr. Vijay Kumar

SUBMITTED BY: JAIDEV PANWAR

B2802A14

10811015

ACKNOWLEDGEMENT

First and foremost, I thank my teacher, Mr. Vijay Kumar, who assigned me this term paper to bring out my creative capabilities.

I express my gratitude to my parents for being a continuous source of encouragement and for all their financial aid.

I would like to acknowledge the assistance provided to me by the library staff of LOVELY PROFESSIONAL UNIVERSITY.

My heartfelt gratitude goes to my classmates for helping me to complete my work in time.


Contents

1. Introduction
2. Heap applications
3. Fibonacci heap
4. Structure
5. Implementation of operations
6. Binary heap
7. Heap operations
8. Adding to the heap
9. Deleting the root from the heap
10. Building a heap
11. Heap implementation


Introduction

In computer science, a heap is a specialized tree-based data structure that satisfies the heap property: if B is a child node of A, then key(A) ≥ key(B). This implies that an element with the greatest key is always in the root node, and so such a heap is sometimes called a max-heap. (Alternatively, if the comparison is reversed, the smallest element is always in the root node, which results in a min-heap.) The several variants of heaps are the prototypical and most efficient implementations of the priority queue abstract data type. Priority queues are useful in many applications. In particular, heaps are crucial in several efficient graph algorithms.

The operations commonly performed with a heap are

delete-max or delete-min: removing the root node of a max- or min-heap, respectively

increase-key or decrease-key: updating a key within a max- or min-heap, respectively

insert: adding a new key to the heap

merge: joining two heaps to form a valid new heap containing all the elements of both.
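As a rough illustration of these operations, the sketch below uses Python's standard heapq module, which keeps a binary min-heap in an ordinary list. The variable names are illustrative only. heapq has no decrease-key operation, so that one is not shown, and the merge here is done by concatenating the lists and rebuilding the heap.

```python
import heapq

# insert: build a min-heap by pushing keys one at a time.
heap = []
for key in [7, 3, 9, 1]:
    heapq.heappush(heap, key)

smallest = heap[0]               # find-min: the root sits at index 0
removed = heapq.heappop(heap)    # delete-min: removes and returns the root

# merge: concatenate both lists and re-establish the heap property.
other = [4, 2]
heapq.heapify(other)
merged = heap + other
heapq.heapify(merged)

print(smallest, removed, merged[0])  # prints: 1 1 2
```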

Procedure       Binary heap     Binomial heap   Fibonacci heap
                (worst-case)    (worst-case)    (amortized)
--------------------------------------------------------------
MAKE-HEAP       Θ(1)            Θ(1)            Θ(1)
INSERT          Θ(lg n)         O(lg n)         Θ(1)
MINIMUM         Θ(1)            O(lg n)         Θ(1)
EXTRACT-MIN     Θ(lg n)         Θ(lg n)         O(lg n)
UNION           Θ(n)            O(lg n)         Θ(1)
DECREASE-KEY    Θ(lg n)         Θ(lg n)         Θ(1)
DELETE          Θ(lg n)         Θ(lg n)         O(lg n)


In practice, however, the Fibonacci heap performs worse in all cases than the binomial heap and the binary heap. Even though the Fibonacci heap has a better O-notation bound, the small number of DecreaseKey operations in practice explains why Dijkstra's algorithm codes based on binary heaps perform better than ones based on Fibonacci heaps (see [1]).

For any heap size, the cost of extract-min is significantly higher in a Fibonacci heap implementation than in a binary heap implementation. The costs of insert and decrease-key are somewhat lower for the Fibonacci heap implementation, with the difference increasing with the heap size. The number of decrease-key operations is not large enough to make Fibonacci heaps perform better than binary heaps.

A Fibonacci heap might outperform a binary heap in an application in which the number of decrease-key operations is much higher than the number of extract-min operations. The minimum cut algorithm of Nagamochi and Ibaraki provides such a candidate application.
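One way to see how many decrease-key operations a given input triggers is to count them while running Dijkstra's algorithm. The sketch below is only an illustration under stated assumptions: it uses Python's heapq (a binary heap) and simulates decrease-key by pushing a fresh entry and skipping stale ones, and the function name dijkstra_counts and the four-vertex graph are made up for this example.

```python
import heapq

def dijkstra_counts(graph, source):
    """Return (dist, extract_min_count, decrease_key_count).

    graph maps each vertex to a list of (neighbour, non-negative weight).
    """
    dist = {source: 0}
    done = set()
    heap = [(0, source)]
    extract_mins = 0
    decrease_keys = 0
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue                  # stale entry left by a simulated decrease-key
        done.add(u)
        extract_mins += 1
        for v, w in graph[u]:
            nd = d + w
            if v not in dist:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))   # first time reached: an insert
            elif nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))   # improvement: counted as decrease-key
                decrease_keys += 1
    return dist, extract_mins, decrease_keys

# Hypothetical example graph.
graph = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
dist, n_extract, n_decrease = dijkstra_counts(graph, "a")
print(n_extract, n_decrease)  # prints: 4 2
```

On this small graph there are four extract-min operations and two decrease-key operations. Counting both on real inputs shows whether an application falls into the regime where a Fibonacci heap could pay off.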

Heap applications

The heap data structure has many applications.

Heapsort: One of the best sorting methods, being in-place and having no quadratic worst-case scenarios.

Selection algorithms: Finding the min, the max, both the min and max, the median, or even the k-th largest element can be done efficiently using heaps. For example, the k-th largest of n elements can be found in O(n log k) time with a heap of size k, which is linear in n for fixed k.

Graph algorithms: By using heaps as internal traversal data structures, run time will be reduced by polynomial order. Examples of such problems are Prim's minimal spanning tree algorithm and Dijkstra's shortest path problem.

Full and almost full binary heaps may be represented in a very space-efficient way using an array alone. The first (or last) element will contain the root. The next two elements of the array contain its children. The next four contain the four children of the two child nodes, etc. Thus the children of the node at position n would be at positions 2n and 2n+1 in a one-based array, or 2n+1 and 2n+2 in a zero-based array. This allows moving up or down the tree by doing simple index computations. Balancing a heap is done by swapping elements which are out of order. As we can build a heap from an array without requiring extra memory (for the nodes, for example), heapsort can be used to sort an array in-place.

One more advantage of heaps over trees in some applications is that construction of heaps can be done in linear time using Floyd's bottom-up construction algorithm.

Fibonacci heap

In computer science, a Fibonacci heap is a heap data structure consisting of a forest of trees. It has a better amortized running time than a binomial heap. Fibonacci heaps were developed by Michael L. Fredman and Robert E. Tarjan in 1984 and first published in a scientific journal in 1987. The name of Fibonacci heap comes from Fibonacci numbers which are used in the running time analysis.

Operations insert, find minimum, decrease key, and merge (union) work in constant amortized time. Operations delete and delete minimum work in O(log n) amortized time. This means that starting from an empty data structure, any sequence of a operations from the first group and b operations from the second group would take O(a + b log n) time. In a binomial heap such a sequence of operations would take O((a + b)log (n)) time. A Fibonacci heap is thus better than a binomial heap when b is asymptotically smaller than a.

Using Fibonacci heaps for priority queues improves the asymptotic running time of important algorithms, such as Dijkstra's algorithm for computing shortest paths in a graph, and Prim's algorithm for computing a minimum spanning tree of a graph.

Structure

Example of a Fibonacci heap. It has three trees of degrees 0, 1 and 3. Three vertices are marked (shown in blue). Therefore the potential of the heap is 9.


A Fibonacci heap is a collection of trees satisfying the minimum-heap property, that is, the key of a child is always greater than or equal to the key of the parent. This implies that the minimum key is always at the root of one of the trees. Compared with binomial heaps, the structure of a Fibonacci heap is more flexible. The trees do not have a prescribed shape and in the extreme case the heap can have every element in a separate tree or a single tree of depth n. This flexibility allows some operations to be executed in a "lazy" manner, postponing the work for later operations. For example merging heaps is done simply by concatenating the two lists of trees, and operation decrease key sometimes cuts a node from its parent and forms a new tree.

However, at some point some order needs to be introduced to the heap to achieve the desired running time. In particular, degrees of nodes (here degree means the number of children) are kept quite low: every node has degree at most O(log n) and the size of a subtree rooted in a node of degree k is at least F_{k+2}, where F_k is the k-th Fibonacci number. This is achieved by the rule that we can cut at most one child of each non-root node. When a second child is cut, the node itself needs to be cut from its parent and becomes the root of a new tree. The number of trees is decreased in the operation delete minimum, where trees are linked together.

As a result of a relaxed structure, some operations can take a long time while others are done very quickly. In the amortized running time analysis we pretend that very fast operations take a little bit longer than they actually do. This additional time is then later subtracted from the actual running time of slow operations. The amount of time saved for later use is measured at any given moment by a potential function. The potential of a Fibonacci heap is given by

Potential = t + 2m

where t is the number of trees in the Fibonacci heap, and m is the number of marked nodes. A node is marked if at least one of its children was cut since this node was made a child of another node (all roots are unmarked).
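The potential can be evaluated directly from the two counts. The short sketch below (the function name potential is chosen here for illustration) reproduces the value quoted for the example heap with three trees and three marked nodes.

```python
def potential(num_trees, num_marked):
    """Potential of a Fibonacci heap: t + 2m."""
    return num_trees + 2 * num_marked

# Example heap from the structure figure: 3 trees, 3 marked nodes.
print(potential(3, 3))  # prints: 9
```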

Thus, the root of each tree in a heap has one unit of time stored. This unit of time can be used later to link this tree with another tree at amortized time 0. Also, each marked node has two units of time stored. One can be used to cut the node from its parent. If this happens, the node becomes a root and the second unit of time will remain stored in it as in any other root.

Implementation of operations

To allow fast deletion and concatenation, the roots of all trees are linked using a circular, doubly linked list. The children of each node are also linked using such a list. For each node, we maintain its number of children and whether the node is marked. Moreover we maintain a pointer to the root containing the minimum key.

Operation find minimum is now trivial because we keep the pointer to the node containing it. It does not change the potential of the heap, therefore both the actual and the amortized cost are constant. As mentioned above, merge is implemented simply by concatenating the lists of tree roots of the two heaps. This can be done in constant time and the potential does not change, leading again to constant amortized time. Operation insert works by creating a new heap with one element and doing merge. This takes constant time, and the potential increases by one, because the number of trees increases. The amortized cost is thus still constant.
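The node layout and the three constant-time operations described so far can be sketched as follows. This is a minimal sketch, not a complete Fibonacci heap: the class and field names (Node, FibHeap, left, right, marked) are chosen here for illustration, and extract minimum and decrease key are left out.

```python
class Node:
    """One heap node; siblings form a circular doubly linked list."""
    def __init__(self, key):
        self.key = key
        self.parent = None
        self.child = None      # pointer to any one child
        self.degree = 0        # number of children
        self.marked = False
        self.left = self
        self.right = self


class FibHeap:
    def __init__(self):
        self.min = None        # root holding the minimum key
        self.size = 0

    def find_min(self):
        return None if self.min is None else self.min.key

    def merge(self, other):
        """Concatenate the root lists in O(1); other must not be reused."""
        if other.min is None:
            return
        if self.min is None:
            self.min = other.min
        else:
            # Splice the two circular lists together.
            a, b = self.min, other.min
            a_right, b_left = a.right, b.left
            a.right, b.left = b, a
            b_left.right, a_right.left = a_right, b_left
            if b.key < a.key:
                self.min = b
        self.size += other.size

    def insert(self, key):
        """Create a one-element heap and merge it in."""
        single = FibHeap()
        single.min = Node(key)
        single.size = 1
        self.merge(single)

    def roots(self):
        """Yield every root by walking the circular list once."""
        if self.min is None:
            return
        node = self.min
        while True:
            yield node
            node = node.right
            if node is self.min:
                break


h = FibHeap()
for k in [5, 2, 8]:
    h.insert(k)
g = FibHeap()
g.insert(1)
g.insert(9)
h.merge(g)
print(h.find_min(), h.size)  # prints: 1 5
```

After these inserts every element is still its own tree, which matches the "lazy" behaviour described above: no linking happens until an extract minimum is performed.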

Fibonacci heap after first phase of extract minimum. Node with key 1 (the minimum) was deleted and its children were added as separate trees.

Operation extract minimum (same as delete minimum) operates in three phases. First we take the root containing the minimum element and remove it. Its children will become roots of new trees. If the number of children was d, it takes time O(d) to process all new roots and the potential increases by d-1. Therefore the amortized running time of this phase is O(d) = O(log n).

Fibonacci heap, after extract minimum is completed. First, nodes 3 and 6 are linked together. Then the result is linked with tree rooted at node 2. Finally, the new minimum is found.

However to complete the extract minimum operation, we need to update the pointer to the root with minimum key. Unfortunately there may be up to n roots we need to check. In the second phase we therefore decrease the number of roots by successively linking together roots of the same degree. When two roots u and v have the same degree, we make one of them a child of the other so that the one with smaller key remains the root. Its degree will increase by one. This is repeated until every root has a different degree. To find trees of the same degree efficiently we use an array of length O(log n) in which we keep a pointer to one root of each degree. When a second root is found of the same degree, the two are linked and the array is updated. The actual running time is O(log n + m) where m is the number of roots at the beginning of the second phase. At the end we will have at most O(log n) roots (because each has a different degree). Therefore the potential decreases by at least m-O(log n) and the amortized running time is O(log n).

In the third phase we check each of the remaining roots and find the minimum. This takes O(log n) time and the potential does not change. The overall amortized running time of extract minimum is therefore O(log n).
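The three phases can be sketched in a simplified form. The sketch below makes several assumptions that differ from the description above: roots and children are kept in plain Python lists instead of circular linked lists, the degree table is a dictionary instead of an array of length O(log n), and the minimum root is found by scanning rather than through a stored pointer. The names Tree, link and extract_min are illustrative.

```python
class Tree:
    def __init__(self, key):
        self.key = key
        self.children = []

    @property
    def degree(self):
        return len(self.children)


def link(u, v):
    """Make the root with the larger key a child of the other; return the new root."""
    if v.key < u.key:
        u, v = v, u
    u.children.append(v)
    return u


def extract_min(roots):
    """Remove the minimum from a non-empty root list; return (min_key, new_roots)."""
    # Phase 1: remove the minimum root and promote its children to roots.
    min_root = min(roots, key=lambda r: r.key)
    pending = [r for r in roots if r is not min_root] + min_root.children

    # Phase 2: link roots of equal degree until all degrees are distinct.
    by_degree = {}
    for root in pending:
        while root.degree in by_degree:
            root = link(root, by_degree.pop(root.degree))
        by_degree[root.degree] = root

    # Phase 3: the caller finds the new minimum among the remaining roots.
    return min_root.key, list(by_degree.values())


roots = [Tree(k) for k in [7, 3, 9, 1, 5, 8]]
key, roots = extract_min(roots)
print(key, sorted(r.degree for r in roots))  # prints: 1 [0, 2]
```

Starting from six single-node trees, one extract minimum leaves only two roots, of degrees 0 and 2.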


Fibonacci heap after decreasing key of node 9 to 0. This node as well as its two marked ancestors are cut from the tree rooted at 1 and placed as new roots.

Operation decrease key will take the node, decrease the key and if the heap property becomes violated (the new key is smaller than the key of the parent), the node is cut from its parent. If the parent is not a root, it is marked. If it has been marked already, it is cut as well and its parent is marked. We continue upwards until we reach either the root or an unmarked node. In the process we create some number, say k, of new trees. Each of these new trees except possibly the first one was marked originally but as a root it will become unmarked. One node can become marked. Therefore the potential decreases by at least k − 2. The actual time to perform the cutting was O(k), therefore the amortized running time is constant.
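The cut and the cascading cut can be sketched as below. This is a simplified illustration: children and roots are kept in Python lists, the update of the minimum pointer is omitted, and the example chain 1, 4, 7, 9 with 4 and 7 marked is a made-up tree, not the one from the figure.

```python
class Node:
    def __init__(self, key, parent=None):
        self.key = key
        self.parent = parent
        self.children = []
        self.marked = False
        if parent is not None:
            parent.children.append(self)


def cut(node, roots):
    """Detach node from its parent and make it an unmarked root."""
    node.parent.children.remove(node)
    node.parent = None
    node.marked = False
    roots.append(node)


def decrease_key(node, new_key, roots):
    assert new_key <= node.key
    node.key = new_key
    parent = node.parent
    if parent is None or parent.key <= node.key:
        return                       # heap property still holds
    cut(node, roots)
    # Cascading cut: keep cutting while the ancestor was already marked.
    while parent.parent is not None:
        if not parent.marked:
            parent.marked = True     # first lost child: mark and stop
            break
        grandparent = parent.parent
        cut(parent, roots)
        parent = grandparent


# Hypothetical tree: 1 -> 4 -> 7 -> 9, with 4 and 7 already marked.
n1 = Node(1)
n4 = Node(4, n1)
n7 = Node(7, n4)
n9 = Node(9, n7)
n4.marked = n7.marked = True
roots = [n1]
decrease_key(n9, 0, roots)
print([r.key for r in roots])  # prints: [1, 0, 7, 4]
```

Here k = 3 new trees are created, and both previously marked nodes become unmarked roots.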

Finally, operation delete can be implemented simply by decreasing the key of the element to be deleted to minus infinity, thus turning it into the minimum of the whole heap. Then we call extract minimum to remove it. The amortized running time of this operation is O(log n).

Binary heap

Example of a complete binary max heap

Example of a complete binary min heap


A binary heap is a heap data structure created using a binary tree. It can be seen as a binary tree with two additional constraints:

The shape property: the tree is a complete binary tree; that is, all levels of the tree, except possibly the last one (deepest) are fully filled, and, if the last level of the tree is not complete, the nodes of that level are filled from left to right.

The heap property: each node is greater than or equal to each of its children according to some comparison predicate which is fixed for the entire data structure.

"Greater than or equal to" means according to whatever comparison function is chosen to sort the heap, not necessarily "greater than or equal to" in the mathematical sense (since the quantities are not always numerical). Heaps where the comparison function is mathematical "greater than or equal to" are called "max-heaps"; those where the comparison function is mathematical "less than or equal to" are called "min-heaps". Conventionally, min-heaps are used, since they are readily applicable for use in priority queues.

Note that the ordering of siblings in a heap is not specified by the heap property, so the two children of a parent can be freely interchanged, as long as this does not violate the shape and heap properties (compare with treap).

The binary heap is a special case of the d-ary heap in which d = 2.

It is possible to modify the heap structure to allow extraction of both the smallest and largest element in O(log n) time.[1] To do this, the rows alternate between min heap and max heap. The algorithms are roughly the same, but each step must consider the alternating rows with alternating comparisons. The performance is roughly the same as a normal single direction heap. This idea can be generalised to a min-max-median heap.

Heap operations

Adding to the heap

If we have a heap, and we add an element, we can perform an operation known as up-heap, bubble-up, percolate-up, sift-up, or heapify-up in order to restore the heap property. We can do this in O(log n) time, using a binary heap, by following this algorithm:

1. Add the element on the bottom level of the heap.

2. Compare the added element with its parent; if they are in the correct order, stop.

3. If not, swap the element with its parent and return to the previous step.


In the worst case we do this once for each level in the tree, that is, as many times as the height of the tree, which is O(log n). However, since approximately 50% of the elements are leaves and 75% are in the bottom two levels, it is likely that the new element to be inserted will only move a few levels upwards to maintain the heap. Thus, binary heaps support insertion in average constant time, O(1).

Say we have a max-heap

and we want to add the number 15 to the heap. We first place the 15 in the position marked by the X. However the heap property is violated since 15 is greater than 8, so we need to swap the 15 and the 8. So, we have the heap looking as follows after the first swap:

However the heap property is still violated since 15 is greater than 11, so we need to swap again:

which is a valid max-heap.
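The insertion just described can be traced in code. The sketch below stores the max-heap in a zero-based Python list. Since the figures are not reproduced here, the starting layout [11, 5, 8, 3, 4] is an assumption: only 11, 8 and 4 come from the text, while 5 and 3 are made-up filler values.

```python
def sift_up(heap, i):
    """Move heap[i] up while it is larger than its parent (max-heap)."""
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] >= heap[i]:
            break                    # correct order: stop
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent


def insert(heap, key):
    heap.append(key)                 # add on the bottom level
    sift_up(heap, len(heap) - 1)


heap = [11, 5, 8, 3, 4]              # assumed starting max-heap
insert(heap, 15)
print(heap)  # prints: [15, 5, 11, 3, 4, 8]
```

The 15 is first swapped with the 8 and then with the 11, as in the walk-through above.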

Deleting the root from the heap

The procedure for deleting the root from the heap—effectively extracting the maximum element in a max-heap or the minimum element in a min-heap—starts by replacing it with the last element on the last level. So, if we have the same max-heap as before, we remove the 11 and replace it with the 4.


Now the heap property is violated since 8 is greater than 4. The operation that restores the property is called down-heap, bubble-down, percolate-down, sift-down, or heapify-down. In this case, swapping the two elements 4 and 8, is enough to restore the heap property and we need not swap elements further:

The downward-moving node is swapped with the larger of its children in a max-heap (in a min-heap it would be swapped with its smaller child), until it satisfies the heap property in its new position. This functionality is achieved by the Max-Heapify function as defined below in pseudocode for an array-backed heap A.

Max-Heapify[2](A, i):
    left ← 2i
    right ← 2i + 1
    if left ≤ heap-length[A] and A[left] > A[i] then:
        largest ← left
    else:
        largest ← i
    if right ≤ heap-length[A] and A[right] > A[largest] then:
        largest ← right
    if largest ≠ i then:
        swap A[i] ↔ A[largest]
        Max-Heapify(A, largest)
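A possible Python rendering of this procedure is sketched below. It differs from the pseudocode in two ways that should be read as choices made for this example: indices are zero-based, so the children of i are 2i + 1 and 2i + 2, and the recursion is written as a loop. The helper delete_root and the starting layout [11, 5, 8, 3, 4] are assumptions (5 and 3 are made-up values; 11, 8 and 4 follow the deletion example in the text).

```python
def max_heapify(a, i, size):
    """Sift a[i] down within a[0:size] until the max-heap property holds."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < size and a[left] > a[largest]:
            largest = left
        if right < size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def delete_root(a):
    """Remove and return the maximum of a non-empty max-heap."""
    root = a[0]
    last = a.pop()                   # last element on the last level
    if a:
        a[0] = last                  # move it to the root, then sift down
        max_heapify(a, 0, len(a))
    return root


heap = [11, 5, 8, 3, 4]              # assumed starting max-heap
removed = delete_root(heap)
print(removed, heap)  # prints: 11 [8, 5, 4, 3]
```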

Note that the down-heap operation (without the preceding swap) can be used in general to modify the value of the root, even when an element is not being deleted.

Building a heap

A heap could be built by successive insertions. This approach requires O(n lg n) time because each insertion takes O(lg n) time and there are n elements ('lg()' denotes a binary logarithm here). However, this is not the optimal method. The optimal method starts by arbitrarily putting the elements on a binary tree (which could be represented by an array, see below). Then, starting from the lowest level and moving upwards, the root of each subtree is shifted downward as in the deletion algorithm until the heap property is restored. More specifically, if all the subtrees starting at some height h (measured from the bottom) have already been "heapified", the trees at height h + 1 can be heapified by sending their root down (along the path of maximum children when building a max-heap, or minimum children when building a min-heap). This process takes O(h) operations (swaps). In this method most of the heapification takes place in the lower levels. The number of nodes at height h is at most $\lceil n / 2^{h+1} \rceil$. Therefore, the cost of heapifying all subtrees is:

$$\sum_{h=0}^{\lceil \lg n \rceil} \frac{n}{2^{h+1}} O(h) = O\left(n \sum_{h=0}^{\lceil \lg n \rceil} \frac{h}{2^{h+1}}\right) \le O\left(n \sum_{h=0}^{\infty} \frac{h}{2^{h}}\right) = O(n)$$

This uses the fact that the infinite series $\sum_{h=0}^{\infty} h / 2^{h}$ converges to 2.

The Build-Max-Heap function that follows converts an array A, which stores a complete binary tree with n nodes, to a max-heap by repeatedly using Max-Heapify in a bottom-up manner. It is based on the observation that the array elements indexed by floor(n/2) + 1, floor(n/2) + 2, ..., n are all leaves of the tree, thus each is a one-element heap. Build-Max-Heap runs Max-Heapify on each of the remaining tree nodes.

Build-Max-Heap[2](A):
    heap-length[A] ← length[A]
    for i ← floor(length[A]/2) downto 1 do
        Max-Heapify(A, i)
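In Python the same bottom-up construction might look as follows. This is a sketch with zero-based indices, so the last non-leaf node is at index n // 2 - 1 rather than floor(n/2); the sift-down helper is repeated here so that the block is self-contained, and the input list is arbitrary example data.

```python
def max_heapify(a, i, size):
    """Sift a[i] down within a[0:size] until the max-heap property holds."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < size and a[left] > a[largest]:
            largest = left
        if right < size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a):
    """Turn the list a into a max-heap in place, in O(n) time."""
    for i in range(len(a) // 2 - 1, -1, -1):   # last non-leaf down to the root
        max_heapify(a, i, len(a))


data = [3, 9, 2, 1, 4, 5]
build_max_heap(data)
print(data)  # prints: [9, 4, 5, 1, 3, 2]
```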

Heap implementation

It is perfectly acceptable to use a traditional binary tree data structure to implement a binary heap. There is an issue with finding the adjacent element on the last level on the binary heap when adding an element which can be resolved algorithmically or by adding extra data to the nodes, called "threading" the tree—that is, instead of merely storing references to the children, we store the inorder successor of the node as well.

A small complete binary tree stored in an array

However, a more common approach, and an approach aligned with the theory behind heaps, is to store the heap in an array. Any binary tree can be stored in an array, but because a heap is always an almost complete binary tree, it can be stored compactly. No space is required for pointers; instead, the parent and children of each node can be found by simple arithmetic on array indices. Details depend on the root position (which in turn may depend on constraints of a programming language used for implementation). If the tree root item has index 0 (n tree elements are a[0] .. a[n−1]), then for each index i, element a[i] has children a[2i+1] and a[2i+2], and the parent a[floor((i−1)/2)], as shown in the figure. If the root is a[1] (tree elements are a[1] .. a[n]), then for each index i, element a[i] has children a[2i] and a[2i+1], and the parent a[floor(i/2)]. This is a simple example of an implicit data structure or Ahnentafel list.
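The index arithmetic for both conventions can be written down directly. The helper names below are chosen for this example only; the root itself has no parent, so the parent functions are meaningful only for non-root indices.

```python
def children0(i):
    """Children of index i when the root is a[0]."""
    return 2 * i + 1, 2 * i + 2

def parent0(i):
    """Parent of index i > 0 when the root is a[0]."""
    return (i - 1) // 2

def children1(i):
    """Children of index i when the root is a[1]."""
    return 2 * i, 2 * i + 1

def parent1(i):
    """Parent of index i > 1 when the root is a[1]."""
    return i // 2

print(children0(2), parent0(5))  # prints: (5, 6) 2
print(children1(3), parent1(7))  # prints: (6, 7) 3
```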

This approach is particularly useful in the heapsort algorithm, where it allows the space in the input array to be reused to store the heap (i.e. the algorithm is in-place). However, it requires allocating the array before filling it, which makes this method less useful in priority queue implementations, where the number of tasks (heap elements) is not necessarily known in advance.

The upheap/downheap operations can then be stated in terms of an array as follows: suppose that the heap property holds for the indices b, b+1, ..., e. The sift-down function extends the heap property to b−1, b, b+1, ..., e. Only index i = b−1 can violate the heap property. Let j be the index of the largest child of a[i] (for a max-heap, or the smallest child for a min-heap) within the range b, ..., e. (If no such index exists because 2i > e then the heap property holds for the newly extended range and nothing needs to be done.) By swapping the values a[i] and a[j] the heap property for position i is established. At this point, the only problem is that the heap property might not hold for index j. The sift-down function is applied tail-recursively to index j until the heap property is established for all elements.

The sift-down function is fast. In each step it only needs two comparisons and one swap. The index value where it is working doubles in each iteration, so that at most log2 e steps are required.

The operation of merging two binary heaps takes Θ(n) time for equal-sized heaps. The best you can do (in the case of an array implementation) is simply to concatenate the two heap arrays and build a heap of the result. When merging is a common task, a different heap implementation is recommended, such as binomial heaps, which can be merged in O(log n) time.

References

en.wikipedia.org/wiki/Heap_(data_structure)

www.leekillough.com/heaps

www.java-tips.org/.../priority-queue-binary-heap-implementation-in-3.html

stackoverflow.com/.../has-anyone-actually-implemented-a-fibonacci-heap-efficiently

www.cs.princeton.edu/~wayne/teaching/fibonacci-heap.pdf

www.itl.nist.gov/div897/sqg/dads/HTML/binaryheap.html

www.springerlink.com/index/y68v5093712w362t.pdf

www.docstoc.com/.../Binomial-Fibonacci-Heaps-and-Amortized-Analysis

www.freebase.com/view/en/fibonacci_heap