Jump to content
Main menu
Main menu
move to sidebar
hide
Navigation
Main page
Recent changes
Random page
Help about MediaWiki
Special pages
Niidae Wiki
Search
Search
Appearance
Create account
Log in
Personal tools
Create account
Log in
Pages for logged out editors
learn more
Contributions
Talk
Editing
Binary search
(section)
Page
Discussion
English
Read
Edit
View history
Tools
Tools
move to sidebar
hide
Actions
Read
Edit
View history
General
What links here
Related changes
Page information
Appearance
move to sidebar
hide
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
== Binary search versus other schemes == Sorted arrays with binary search are a very inefficient solution when insertion and deletion operations are interleaved with retrieval, taking <math display="inline">O(n)</math> time for each such operation. In addition, sorted arrays can complicate memory use especially when elements are often inserted into the array.{{Sfn|Knuth|1997|loc=§2.2.2 ("Sequential Allocation")}} There are other data structures that support much more efficient insertion and deletion. Binary search can be used to perform exact matching and [[Set (abstract data type)|set membership]] (determining whether a target value is in a collection of values). There are data structures that support faster exact matching and set membership. However, unlike many other searching schemes, binary search can be used for efficient approximate matching, usually performing such matches in <math display="inline">O(\log n)</math> time regardless of the type or structure of the values themselves.<ref name="pred">{{cite journal|last1=Beame|first1=Paul|last2=Fich|first2=Faith E.|author-link2=Faith Ellen|title=Optimal bounds for the predecessor problem and related problems|journal=[[Journal of Computer and System Sciences]]|date=2001|volume=65|issue=1|pages=38–72|doi=10.1006/jcss.2002.1822|doi-access=free}}</ref> In addition, there are some operations, like finding the smallest and largest element, that can be performed efficiently on a sorted array.{{sfn|Sedgewick|Wayne|2011|loc=§3.1, subsection "Rank and selection"}} === Linear search === [[Linear search]] is a simple search algorithm that checks every record until it finds the target value. Linear search can be done on a linked list, which allows for faster insertion and deletion than an array. Binary search is faster than linear search for sorted arrays except if the array is short, although the array needs to be sorted beforehand.{{Efn|{{Harvnb|Knuth|1998}} performed a formal time performance analysis of both of these search algorithms. On Knuth's [[MIX (abstract machine)|MIX]] computer, which Knuth designed as a representation of an ordinary computer, binary search takes on average <math display="inline">18 \log n - 16</math> units of time for a successful search, while linear search with a [[sentinel node]] at the end of the list takes <math display="inline">1.75n + 8.5 - \frac{n \text{ mod } 2}{4n}</math> units. Linear search has lower initial complexity because it requires minimal computation, but it quickly outgrows binary search in complexity. On the MIX computer, binary search only outperforms linear search with a sentinel if <math display="inline">n > 44</math>.{{Sfn|Knuth|1998|loc=§6.2.1 ("Searching an ordered table"), subsection "Further analysis of binary search"}}{{Sfn|Knuth|1998|loc=Answers to Exercises (§6.2.1) for "Exercise 5"}}}}{{Sfn|Knuth|1998|loc=§6.2.1 ("Searching an ordered table")}} All [[sorting algorithm]]s based on comparing elements, such as [[quicksort]] and [[merge sort]], require at least <math display="inline">O(n \log n)</math> comparisons in the worst case.{{Sfn|Knuth|1998|loc=§5.3.1 ("Minimum-Comparison sorting")}} Unlike linear search, binary search can be used for efficient approximate matching. There are operations such as finding the smallest and largest element that can be done efficiently on a sorted array but not on an unsorted array.{{Sfn|Sedgewick|Wayne|2011|loc=§3.2 ("Ordered symbol tables")}} === Trees === [[File:Binary search tree search 4.svg|thumb|right|[[Binary search tree]]s are searched using an algorithm similar to binary search.]] A [[binary search tree]] is a [[binary tree]] data structure that works based on the principle of binary search. The records of the tree are arranged in sorted order, and each record in the tree can be searched using an algorithm similar to binary search, taking on average logarithmic time. Insertion and deletion also require on average logarithmic time in binary search trees. This can be faster than the linear time insertion and deletion of sorted arrays, and binary trees retain the ability to perform all the operations possible on a sorted array, including range and approximate queries.<ref name="pred" />{{Sfn|Sedgewick|Wayne|2011|loc=§3.2 ("Binary Search Trees"), subsection "Order-based methods and deletion"}} However, binary search is usually more efficient for searching as binary search trees will most likely be imperfectly balanced, resulting in slightly worse performance than binary search. This even applies to [[self-balancing binary search tree|balanced binary search tree]]s, binary search trees that balance their own nodes, because they rarely produce the tree with the fewest possible levels. Except for balanced binary search trees, the tree may be severely imbalanced with few internal nodes with two children, resulting in the average and worst-case search time approaching <math display="inline">n</math> comparisons.{{Efn|Inserting the values in sorted order or in an alternating lowest-highest key pattern will result in a binary search tree that maximizes the average and worst-case search time.{{Sfn|Knuth|1998|loc=§6.2.2 ("Binary tree searching"), subsection "But what about the worst case?"}}}} Binary search trees take more space than sorted arrays.{{Sfn|Sedgewick|Wayne|2011|loc=§3.5 ("Applications"), "Which symbol-table implementation should I use?"}} Binary search trees lend themselves to fast searching in external memory stored in hard disks, as binary search trees can be efficiently structured in filesystems. The [[B-tree]] generalizes this method of tree organization. B-trees are frequently used to organize long-term storage such as [[database]]s and [[filesystem]]s.{{Sfn|Knuth|1998|loc=§5.4.9 ("Disks and Drums")}}{{Sfn|Knuth|1998|loc=§6.2.4 ("Multiway trees")}} === Hashing === For implementing [[associative arrays]], [[hash table]]s, a data structure that maps keys to [[record (computer science)|records]] using a [[hash function]], are generally faster than binary search on a sorted array of records.{{Sfn|Knuth|1998|loc=§6.4 ("Hashing")}} Most hash table implementations require only [[Amortized analysis|amortized]] constant time on average.{{efn|It is possible to search some hash table implementations in guaranteed constant time.{{Sfn|Knuth|1998|loc=§6.4 ("Hashing"), subsection "History"}}}}<ref>{{cite journal|last1=Dietzfelbinger|first1=Martin|last2=Karlin|first2=Anna|last3=Mehlhorn|first3=Kurt|last4=Meyer auf der Heide|first4=Friedhelm|last5=Rohnert|first5=Hans|last6=Tarjan|first6=Robert E.|author-link2=Anna Karlin|author-link3=Kurt Mehlhorn|author-link6=Robert Tarjan|title=Dynamic perfect hashing: upper and lower bounds|journal=[[SIAM Journal on Computing]]|date=August 1994|volume=23|issue=4|pages=738–761|doi=10.1137/S0097539791194094}}</ref> However, hashing is not useful for approximate matches, such as computing the next-smallest, next-largest, and nearest key, as the only information given on a failed search is that the target is not present in any record.<ref>{{cite web|last1=Morin|first1=Pat|title=Hash tables|url=http://cglab.ca/~morin/teaching/5408/notes/hashing.pdf |archive-url=https://ghostarchive.org/archive/20221009/http://cglab.ca/~morin/teaching/5408/notes/hashing.pdf |archive-date=2022-10-09 |url-status=live|access-date=28 March 2016|page=1}}</ref> Binary search is ideal for such matches, performing them in logarithmic time. Binary search also supports approximate matches. Some operations, like finding the smallest and largest element, can be done efficiently on sorted arrays but not on hash tables.<ref name="pred" /> === Set membership algorithms === A related problem to search is [[Set (abstract data type)|set membership]]. Any algorithm that does lookup, like binary search, can also be used for set membership. There are other algorithms that are more specifically suited for set membership. A [[bit array]] is the simplest, useful when the range of keys is limited. It compactly stores a collection of [[bit]]s, with each bit representing a single key within the range of keys. Bit arrays are very fast, requiring only <math display="inline">O(1)</math> time.{{Sfn|Knuth|2011|loc=§7.1.3 ("Bitwise Tricks and Techniques")}} The Judy1 type of [[Judy array]] handles 64-bit keys efficiently.<ref name="judyarray">{{Citation|last1=Silverstein|first1=Alan|title=Judy IV shop manual|publisher=[[Hewlett-Packard]]|url=http://judy.sourceforge.net/doc/shop_interm.pdf |archive-url=https://ghostarchive.org/archive/20221009/http://judy.sourceforge.net/doc/shop_interm.pdf |archive-date=2022-10-09 |url-status=live|pages=80–81}}</ref> For approximate results, [[Bloom filter]]s, another probabilistic data structure based on hashing, store a [[set (mathematics)|set]] of keys by encoding the keys using a [[bit array]] and multiple hash functions. Bloom filters are much more space-efficient than bit arrays in most cases and not much slower: with <math display="inline">k</math> hash functions, membership queries require only <math display="inline">O(k)</math> time. However, Bloom filters suffer from [[False positives and false negatives|false positives]].{{Efn|This is because simply setting all of the bits which the hash functions point to for a specific key can affect queries for other keys which have a common hash location for one or more of the functions.<ref name="cuckoofilter">{{cite conference|last1=Fan|first1=Bin|last2=Andersen|first2=Dave G.|last3=Kaminsky|first3=Michael|last4=Mitzenmacher|first4=Michael D.|title=Cuckoo filter: practically better than Bloom|conference=Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies|date=2014|pages=75–88|doi=10.1145/2674005.2674994|doi-access=free}}</ref>}}{{Efn|There exist improvements of the Bloom filter which improve on its complexity or support deletion; for example, the cuckoo filter exploits [[cuckoo hashing]] to gain these advantages.<ref name="cuckoofilter" />}}<ref>{{cite journal|last1=Bloom|first1=Burton H.|s2cid=7931252|title=Space/time trade-offs in hash coding with allowable errors|journal=[[Communications of the ACM]]|date=1970|volume=13|issue=7|pages=422–426|doi=10.1145/362686.362692|df=dmy-all|citeseerx=10.1.1.641.9096}}</ref> === Other data structures === There exist data structures that may improve on binary search in some cases for both searching and other operations available for sorted arrays. For example, searches, approximate matches, and the operations available to sorted arrays can be performed more efficiently than binary search on specialized data structures such as [[van Emde Boas tree]]s, [[fusion tree]]s, [[trie]]s, and [[bit array]]s. These specialized data structures are usually only faster because they take advantage of the properties of keys with a certain attribute (usually keys that are small integers), and thus will be time or space consuming for keys that lack that attribute.<ref name="pred" /> As long as the keys can be ordered, these operations can always be done at least efficiently on a sorted array regardless of the keys. Some structures, such as Judy arrays, use a combination of approaches to mitigate this while retaining efficiency and the ability to perform approximate matching.<ref name="judyarray" />
Summary:
Please note that all contributions to Niidae Wiki may be edited, altered, or removed by other contributors. If you do not want your writing to be edited mercilessly, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource (see
Encyclopedia:Copyrights
for details).
Do not submit copyrighted work without permission!
Cancel
Editing help
(opens in new window)
Search
Search
Editing
Binary search
(section)
Add topic