Heap (data structure)
Computer science data structure

In computer science, a heap is a tree-based data structure that satisfies the heap property: in a max heap, each parent node's key is greater than or equal to its children's keys, while in a min heap, it is less than or equal. The heap serves as an efficient implementation of the abstract data type known as a priority queue, where the highest or lowest priority element is at the root. A common variant is the binary heap, a complete binary tree introduced by J. W. J. Williams in 1964 for the heapsort algorithm. Heaps are useful in algorithms like Dijkstra's algorithm and are typically stored in-place within arrays, requiring no extra memory beyond the keys themselves.


Operations

The common operations involving heaps are the following (a short sketch mapping several of them onto Python's heapq module appears after these lists):

Basic
  • find-max (or find-min): find a maximum item of a max-heap, or a minimum item of a min-heap, respectively (a.k.a. peek)
  • insert: adding a new key to the heap (a.k.a. push[4])
  • extract-max (or extract-min): returns the node of maximum value from a max heap (or minimum value from a min heap) after removing it from the heap (a.k.a. pop[5])
  • delete-max (or delete-min): removing the root node of a max heap (or min heap), respectively
  • replace: pop the root and push a new key. This is more efficient than a pop followed by a push, since it only needs to balance once, not twice, and is appropriate for fixed-size heaps.[6]
Creation
  • create-heap: create an empty heap
  • heapify: create a heap out of a given array of elements
  • merge (union): joining two heaps to form a valid new heap containing all the elements of both, preserving the original heaps.
  • meld: joining two heaps to form a valid new heap containing all the elements of both, destroying the original heaps.
Inspection
  • size: return the number of items in the heap.
  • is-empty: return true if the heap is empty, false otherwise.
Internal
  • increase-key or decrease-key: updating a key within a max- or min-heap, respectively
  • delete: delete an arbitrary node (followed by moving the last node into its place and sifting to maintain the heap property)
  • sift-up: move a node up in the tree, as long as needed; used to restore the heap condition after insertion. Called "sift" because the node moves up the tree until it reaches the correct level, as in a sieve.
  • sift-down: move a node down in the tree, similar to sift-up; used to restore heap condition after deletion or replacement.
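These names map naturally onto Python's heapq module (the min-heap library cited above for push, pop, and replace). Below is a minimal sketch, assuming a min-heap of integers kept in an ordinary Python list:

    import heapq

    heap = []                        # create-heap: an empty list is an empty heap
    for key in [5, 1, 9, 3]:
        heapq.heappush(heap, key)    # insert (push)

    smallest = heap[0]               # find-min (peek): the root sits at index 0
    smallest = heapq.heappop(heap)   # extract-min (pop): remove and return the root
    previous = heapq.heapreplace(heap, 7)  # replace: pop the root and push 7, sifting only once

    other = [2, 8, 6]
    heapq.heapify(other)             # heapify: turn an existing list into a heap in O(n)

    melded = heap + other            # meld two binary heaps by concatenating ...
    heapq.heapify(melded)            # ... and re-heapifying, Θ(n)

    print(len(melded), len(melded) == 0)   # size and is-empty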

Implementation using arrays

Heaps are usually implemented with an array, as follows:

  • Each element in the array represents a node of the heap, and
  • The parent / child relationship is defined implicitly by the elements' indices in the array.

For a binary heap, the first index of the array contains the root element. The next two indices contain the root's children. The next four indices contain the four children of the root's two child nodes, and so on. Therefore, given a node at index i, its children are at indices 2i + 1 and 2i + 2, and its parent is at index ⌊(i − 1)/2⌋. This simple indexing scheme makes it efficient to move "up" or "down" the tree.
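As an illustration of this indexing scheme (using 0-based indices, matching the formulas above), the following helper functions are a small sketch, not a standard API:

    def parent(i):
        return (i - 1) // 2      # floor((i - 1) / 2)

    def left(i):
        return 2 * i + 1

    def right(i):
        return 2 * i + 2

    # The node at index 4 has children at indices 9 and 10 and its parent at index 1.
    assert (left(4), right(4), parent(4)) == (9, 10, 1)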

Balancing a heap is done by sift-up or sift-down operations (swapping elements which are out of order). As we can build a heap from an array without requiring extra memory (for the nodes, for example), heapsort can be used to sort an array in-place.
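As a sketch of how this works in practice, the heapsort below builds a max-heap inside the array itself (bottom-up) and then repeatedly moves the current maximum just past the shrinking heap; no memory beyond the array is used. The function names are illustrative only:

    def heapsort(a):
        # Sort the list a in place: build a max-heap, then repeatedly extract the maximum.
        def sift_down(end, i):
            # Sink a[i] within the heap prefix a[0:end] until the max-heap property holds.
            while True:
                left, right, largest = 2 * i + 1, 2 * i + 2, i
                if left < end and a[left] > a[largest]:
                    largest = left
                if right < end and a[right] > a[largest]:
                    largest = right
                if largest == i:
                    return
                a[i], a[largest] = a[largest], a[i]
                i = largest

        n = len(a)
        for i in range(n // 2 - 1, -1, -1):  # build the max-heap bottom-up
            sift_down(n, i)
        for end in range(n - 1, 0, -1):
            a[0], a[end] = a[end], a[0]      # move the current maximum past the heap ...
            sift_down(end, 0)                # ... and restore the heap on the shorter prefix

    data = [5, 2, 9, 1, 7]
    heapsort(data)
    assert data == [1, 2, 5, 7, 9]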

After an element is inserted into or deleted from a heap, the heap property may be violated, and the heap must be re-balanced by swapping elements within the array.

Although different types of heaps implement the operations differently, the most common way is as follows:

  • Insertion: Add the new element at the end of the heap, in the first available free space. If this violates the heap property, sift up the new element (swim operation) until the heap property is reestablished.
  • Extraction: Remove the root and move the last element of the heap into the root. If this violates the heap property, sift down the new root (sink operation) to reestablish the heap property.
  • Replacement: Remove the root, put the new element in its place, and sift down. Compared to extraction followed by insertion, this avoids a sift-up step (see the sketch following this list).
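A minimal sketch of these three procedures for an array-backed binary min-heap follows; the function and variable names are illustrative, not a standard API:

    def sift_up(heap, i):
        # Swim heap[i] up while it is smaller than its parent (restores the heap after insertion).
        while i > 0:
            parent = (i - 1) // 2
            if heap[parent] <= heap[i]:
                break
            heap[parent], heap[i] = heap[i], heap[parent]
            i = parent

    def sift_down(heap, i):
        # Sink heap[i] down while a child is smaller (restores the heap after extraction or replacement).
        n = len(heap)
        while True:
            left, right, smallest = 2 * i + 1, 2 * i + 2, i
            if left < n and heap[left] < heap[smallest]:
                smallest = left
            if right < n and heap[right] < heap[smallest]:
                smallest = right
            if smallest == i:
                break
            heap[i], heap[smallest] = heap[smallest], heap[i]
            i = smallest

    def insert(heap, key):
        heap.append(key)               # place the new key in the first free slot ...
        sift_up(heap, len(heap) - 1)   # ... then swim it up

    def extract_min(heap):
        root = heap[0]
        heap[0] = heap[-1]             # move the last element into the root ...
        heap.pop()
        if heap:
            sift_down(heap, 0)         # ... then sink it down
        return root

    def replace(heap, key):
        root, heap[0] = heap[0], key   # overwrite the root with the new key ...
        sift_down(heap, 0)             # ... and sink it; no sift-up is needed
        return root

    h = []
    for x in [4, 1, 3]:
        insert(h, x)
    assert extract_min(h) == 1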

Construction of a binary (or d-ary) heap out of a given array of elements may be performed in linear time using the classic Floyd algorithm, with the worst-case number of comparisons equal to 2N − 2s₂(N) − e₂(N) (for a binary heap), where s₂(N) is the sum of all digits of the binary representation of N and e₂(N) is the exponent of 2 in the prime factorization of N.[7] This is faster than a sequence of consecutive insertions into an originally empty heap, which is log-linear.[8]
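A sketch of this linear-time bottom-up (Floyd) construction for a binary min-heap, with the sift-down logic repeated inline so the example stands alone:

    def build_heap(a):
        # Floyd's method: sift down every non-leaf node, from the last parent up to the root.
        n = len(a)
        for i in range(n // 2 - 1, -1, -1):
            j = i
            while True:            # sift a[j] down to its proper place
                left, right, smallest = 2 * j + 1, 2 * j + 2, j
                if left < n and a[left] < a[smallest]:
                    smallest = left
                if right < n and a[right] < a[smallest]:
                    smallest = right
                if smallest == j:
                    break
                a[j], a[smallest] = a[smallest], a[j]
                j = smallest

    data = [9, 4, 7, 1, 3, 8]
    build_heap(data)               # Θ(n) overall, versus Θ(n log n) for n repeated insertions
    assert data[0] == min(data)    # the root now holds the minimum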

Variants

Comparison of theoretic bounds for variants

Here are time complexities[9] of various heap data structures. The abbreviation am. indicates that the given complexity is amortized, otherwise it is a worst-case complexity. For the meaning of "O(f)" and "Θ(f)" see Big O notation. Names of operations assume a max-heap.

Heap variant             | find-max | delete-max   | increase-key     | insert       | meld         | make-heap[10]
Binary[11]               | Θ(1)     | Θ(log n)     | Θ(log n)         | Θ(log n)     | Θ(n)         | Θ(n)
Skew[12]                 | Θ(1)     | O(log n) am. | O(log n) am.     | O(log n) am. | O(log n) am. | Θ(n) am.
Leftist[13]              | Θ(1)     | Θ(log n)     | Θ(log n)         | Θ(log n)     | Θ(log n)     | Θ(n)
Binomial[14][15]         | Θ(1)     | Θ(log n)     | Θ(log n)         | Θ(1) am.     | Θ(log n)[16] | Θ(n)
Skew binomial[17]        | Θ(1)     | Θ(log n)     | Θ(log n)         | Θ(1)         | Θ(log n)[18] | Θ(n)
2–3 heap[19]             | Θ(1)     | O(log n) am. | Θ(1)             | Θ(1) am.     | O(log n)[20] | Θ(n)
Bottom-up skew[21]       | Θ(1)     | O(log n) am. | O(log n) am.     | Θ(1) am.     | Θ(1) am.     | Θ(n) am.
Pairing[22]              | Θ(1)     | O(log n) am. | o(log n) am.[23] | Θ(1)         | Θ(1)         | Θ(n)
Rank-pairing[24]         | Θ(1)     | O(log n) am. | Θ(1) am.         | Θ(1)         | Θ(1)         | Θ(n)
Fibonacci[25][26]        | Θ(1)     | O(log n) am. | Θ(1) am.         | Θ(1)         | Θ(1)         | Θ(n)
Strict Fibonacci[27][28] | Θ(1)     | Θ(log n)     | Θ(1)             | Θ(1)         | Θ(1)         | Θ(n)
Brodal[29][30]           | Θ(1)     | Θ(log n)     | Θ(1)             | Θ(1)         | Θ(1)         | Θ(n)[31]

Applications

The heap data structure has many applications.

  • Heapsort: One of the best sorting methods, being in-place and having no quadratic worst-case scenarios.
  • Selection algorithms: A heap allows access to the min or max element in constant time, and other selections (such as median or kth-element) can be done in sub-linear time on data that is in a heap.[32]
  • Graph algorithms: By using heaps as internal traversal data structures, the run time of several graph algorithms is reduced asymptotically; for example, Prim's minimal-spanning-tree algorithm and Dijkstra's shortest-path algorithm run in O((E + V) log V) time with a binary heap instead of O(V²) with an unsorted distance array.
  • Priority queue: A priority queue is an abstract concept like "a list" or "a map"; just as a list can be implemented with a linked list or an array, a priority queue can be implemented with a heap or a variety of other methods.
  • K-way merge: A heap data structure is useful to merge many already-sorted input streams into a single sorted output stream. Examples of the need for merging include external sorting and streaming results from distributed data such as a log-structured merge tree. The inner loop obtains the min element, replaces it with the next element from the corresponding input stream, and performs a sift-down heap operation (or, equivalently, uses the replace function), as sketched below; using separate extract and insert operations of a priority queue is much less efficient.
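A sketch of this replace-based inner loop using Python's heapq (heapreplace is the replace operation cited above); the heap holds one (value, stream index) pair per input stream, and the code assumes the streams never contain None:

    import heapq

    def k_way_merge(sorted_lists):
        # Merge already-sorted lists into one sorted list using a min-heap of stream heads.
        iterators = [iter(lst) for lst in sorted_lists]
        heap = []
        for idx, it in enumerate(iterators):
            head = next(it, None)
            if head is not None:
                heap.append((head, idx))
        heapq.heapify(heap)

        merged = []
        while heap:
            value, idx = heap[0]                       # smallest head across all streams
            merged.append(value)
            following = next(iterators[idx], None)
            if following is None:
                heapq.heappop(heap)                    # stream exhausted: shrink the heap
            else:
                heapq.heapreplace(heap, (following, idx))  # replace the root and sift down once
        return merged

    assert k_way_merge([[1, 4, 7], [2, 5], [3, 6, 8]]) == [1, 2, 3, 4, 5, 6, 7, 8]

The standard library's heapq.merge generator implements the same idea.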

Programming language implementations

  • The C++ Standard Library provides the make_heap, push_heap and pop_heap algorithms for heaps (usually implemented as binary heaps), which operate on arbitrary random access iterators. It treats the iterators as a reference to an array, and uses the array-to-heap conversion. It also provides the container adaptor priority_queue, which wraps these facilities in a container-like class. However, there is no standard support for the replace, sift-up/sift-down, or decrease/increase-key operations.
  • The Boost C++ libraries include a heaps library. Unlike the STL, it supports decrease and increase operations, and supports additional types of heap: specifically, it supports d-ary, binomial, Fibonacci, pairing and skew heaps.
  • There is a generic heap implementation for C and C++ with D-ary heap and B-heap support. It provides an STL-like API.
  • The standard library of the D programming language includes std.container.BinaryHeap, which is implemented in terms of D's ranges. Instances can be constructed from any random-access range. BinaryHeap exposes an input range interface that allows iteration with D's built-in foreach statements and integration with the range-based API of the std.algorithm package.
  • For Haskell there is the Data.Heap module.
  • The Java platform (since version 1.5) provides a binary heap implementation with the class java.util.PriorityQueue in the Java Collections Framework. This class implements by default a min-heap; to implement a max-heap, a programmer should write a custom comparator. There is no support for the replace, sift-up/sift-down, or decrease/increase-key operations.
  • Python has a heapq module that implements a priority queue using a binary heap. The library exposes a heapreplace function to support k-way merging. Python only supports a min-heap implementation.
  • PHP has both max-heap (SplMaxHeap) and min-heap (SplMinHeap) as of version 5.3 in the Standard PHP Library.
  • Perl has implementations of binary, binomial, and Fibonacci heaps in the Heap distribution available on CPAN.
  • The Go language contains a heap package with heap algorithms that operate on an arbitrary type that satisfies a given interface. That package does not support the replace, sift-up/sift-down, or decrease/increase-key operations.
  • Apple's Core Foundation library contains a CFBinaryHeap structure.
  • Pharo has an implementation of a heap in the Collections-Sequenceable package along with a set of test cases. A heap is used in the implementation of the timer event loop.
  • The Rust programming language has a binary max-heap implementation, BinaryHeap, in the collections module of its standard library.
  • .NET has the PriorityQueue class, which uses a quaternary (4-ary) min-heap implementation. It is available from .NET 6.

See also

  • Wikimedia Commons has media related to Heap data structures.
  • The Wikibook Data Structures has a page on the topic of: Min and Max Heaps
  • Heap at Wolfram MathWorld
  • Explanation of how the basic heap algorithms work
  • Bentley, Jon Louis (2000). Programming Pearls (2nd ed.). Addison Wesley. pp. 147–162. ISBN 0201657880.

References

  1. Black, Paul E., ed. (2004-12-14). "Heap". Dictionary of Algorithms and Data Structures, online version. U.S. National Institute of Standards and Technology. Retrieved 2017-10-08 from https://xlinux.nist.gov/dads/HTML/heap.html.

  2. Cormen, Thomas H. (2009). Introduction to Algorithms. Cambridge, Massachusetts: The MIT Press. pp. 151–152. ISBN 978-0-262-03384-8.

  3. Williams, J. W. J. (1964), "Algorithm 232 – Heapsort", Communications of the ACM, 7 (6): 347–348, doi:10.1145/512274.512284.

  4. The Python Standard Library, 8.4. heapq — Heap queue algorithm, heapq.heappush https://docs.python.org/3/library/heapq.html#heapq.heappush

  5. The Python Standard Library, 8.4. heapq — Heap queue algorithm, heapq.heappop https://docs.python.org/3/library/heapq.html#heapq.heappop

  6. The Python Standard Library, 8.4. heapq — Heap queue algorithm, heapq.heapreplace https://docs.python.org/3/library/heapq.html#heapq.heapreplace

  7. Suchenek, Marek A. (2012), "Elementary Yet Precise Worst-Case Analysis of Floyd's Heap-Construction Program", Fundamenta Informaticae, 120 (1), IOS Press: 75–92, doi:10.3233/FI-2012-751.

  8. Each insertion takes O(log k) time in the existing heap size k, giving a total of ∑_{k=1}^{n} O(log k). Since log(n/2) = (log n) − 1, a constant fraction (half) of these insertions are within a constant factor of the maximum, so asymptotically we can take k = n; formally the time is nO(log n) − O(n) = O(n log n). This can also be readily seen from Stirling's approximation.

  9. Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.

  10. make-heap is the operation of building a heap from a sequence of n unsorted elements. It can be done in Θ(n) time whenever meld runs in O(log n) time (where both complexities can be amortized).[9][10] Another algorithm achieves Θ(n) for binary heaps.[11]

  11. Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.

  12. Sleator, Daniel Dominic; Tarjan, Robert Endre (February 1986). "Self-Adjusting Heaps". SIAM Journal on Computing. 15 (1): 52–69. CiteSeerX 10.1.1.93.6678. doi:10.1137/0215004. ISSN 0097-5397.

  13. Tarjan, Robert (1983). "3.3. Leftist heaps". Data Structures and Network Algorithms. pp. 38–42. doi:10.1137/1.9781611970265. ISBN 978-0-89871-187-5.

  14. Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.

  15. "Binomial Heap | Brilliant Math & Science Wiki". brilliant.org. Retrieved 2019-09-30. https://brilliant.org/wiki/binomial-heap/

  16. For persistent heaps (not supporting increase-key), a generic transformation reduces the cost of meld to that of insert, while the new cost of delete-max is the sum of the old costs of delete-max and meld.[14] Here, it makes meld run in Θ(1) time (amortized, if the cost of insert is) while delete-max still runs in O(log n). Applied to skew binomial heaps, it yields Brodal–Okasaki queues, persistent heaps with optimal worst-case complexities.[13]

  17. Brodal, Gerth Stølting; Okasaki, Chris (November 1996), "Optimal purely functional priority queues", Journal of Functional Programming, 6 (6): 839–857, doi:10.1017/s095679680000201x.

  18. For persistent heaps (not supporting increase-key), a generic transformation reduces the cost of meld to that of insert, while the new cost of delete-max is the sum of the old costs of delete-max and meld.[14] Here, it makes meld run in Θ(1) time (amortized, if the cost of insert is) while delete-max still runs in O(log n). Applied to skew binomial heaps, it yields Brodal–Okasaki queues, persistent heaps with optimal worst-case complexities.[13]

  19. Takaoka, Tadao (1999), Theory of 2–3 Heaps (PDF), p. 12.

  20. For persistent heaps (not supporting increase-key), a generic transformation reduces the cost of meld to that of insert, while the new cost of delete-max is the sum of the old costs of delete-max and meld.[14] Here, it makes meld run in Θ(1) time (amortized, if the cost of insert is) while delete-max still runs in O(log n). Applied to skew binomial heaps, it yields Brodal–Okasaki queues, persistent heaps with optimal worst-case complexities.[13]

  21. Sleator, Daniel Dominic; Tarjan, Robert Endre (February 1986). "Self-Adjusting Heaps". SIAM Journal on Computing. 15 (1): 52–69. CiteSeerX 10.1.1.93.6678. doi:10.1137/0215004. ISSN 0097-5397.

  22. Iacono, John (2000), "Improved upper bounds for pairing heaps", Proc. 7th Scandinavian Workshop on Algorithm Theory (PDF), Lecture Notes in Computer Science, vol. 1851, Springer-Verlag, pp. 63–77, arXiv:1110.4428, CiteSeerX 10.1.1.748.7812, doi:10.1007/3-540-44985-X_5, ISBN 3-540-67690-2.

  23. Lower bound of Ω(log log n),[17] upper bound of O(2^(2√(log log n))).[18]

  24. Haeupler, Bernhard; Sen, Siddhartha; Tarjan, Robert E. (November 2011). "Rank-pairing heaps" (PDF). SIAM J. Computing. 40 (6): 1463–1485. doi:10.1137/100785351.

  25. Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.

  26. Fredman, Michael Lawrence; Tarjan, Robert E. (July 1987). "Fibonacci heaps and their uses in improved network optimization algorithms" (PDF). Journal of the Association for Computing Machinery. 34 (3): 596–615. CiteSeerX 10.1.1.309.8927. doi:10.1145/28869.28874.

  27. Brodal, Gerth Stølting; Lagogiannis, George; Tarjan, Robert E. (2012). Strict Fibonacci heaps (PDF). Proceedings of the 44th symposium on Theory of Computing – STOC '12. pp. 1177–1184. CiteSeerX 10.1.1.233.1740. doi:10.1145/2213977.2214082. ISBN 978-1-4503-1245-5.

  28. Brodal queues and strict Fibonacci heaps achieve optimal worst-case complexities for heaps. They were first described as imperative data structures. The Brodal–Okasaki queue is a persistent data structure achieving the same optimum, except that increase-key is not supported.

  29. Brodal, Gerth S. (1996), "Worst-Case Efficient Priority Queues" (PDF), Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 52–58.

  30. Brodal queues and strict Fibonacci heaps achieve optimal worst-case complexities for heaps. They were first described as imperative data structures. The Brodal–Okasaki queue is a persistent data structure achieving the same optimum, except that increase-key is not supported.

  31. Goodrich, Michael T.; Tamassia, Roberto (2004). "7.3.6. Bottom-Up Heap Construction". Data Structures and Algorithms in Java (3rd ed.). pp. 338–341. ISBN 0-471-46983-1.

  32. Frederickson, Greg N. (1993), "An Optimal Algorithm for Selection in a Min-Heap", Information and Computation (PDF), vol. 104, Academic Press, pp. 197–214, doi:10.1006/inco.1993.1030, archived from the original (PDF) on 2012-12-03, retrieved 2010-10-31 https://web.archive.org/web/20121203045606/http://ftp.cs.purdue.edu/research/technical_reports/1991/TR%2091-027.pdf