Randomized algorithms that solve the problem in linear time are known, in Euclidean spaces whose dimension is treated as a constant for the purposes of asymptotic analysis.[2][3][4] This is significantly faster than the O(n²) time (expressed here in big O notation) that would be needed by a naive algorithm that computes the distances between all pairs of points and selects the smallest.
It is also possible to solve the problem without randomization, in random-access machine models of computation with unlimited memory that allow the use of the floor function, in near-linear O(n log log n) time.[5] In even more restricted models of computation, such as the algebraic decision tree, the problem can be solved in the somewhat slower O(n log n) time bound,[6] and this is optimal for this model, by a reduction from the element uniqueness problem. Both sweep line algorithms and divide-and-conquer algorithms with this slower time bound are commonly taught as examples of these algorithm design techniques.[7][8]
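For comparison, the naive quadratic algorithm mentioned above is straightforward; a minimal Python sketch for the two-dimensional case (the function name is illustrative):

```python
import itertools
import math

def closest_pair_brute_force(points):
    """Return the closest pair of 2-D points by checking all pairs.

    Runs in O(n^2) time; this is the baseline that the algorithms
    discussed in this section improve on.
    """
    return min(itertools.combinations(points, 2),
               key=lambda pq: math.dist(*pq))
```

Each of the n(n−1)/2 pairs is examined once, which is exactly the quadratic cost the faster algorithms avoid.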
A linear expected time randomized algorithm of Rabin (1976), modified slightly by Richard Lipton to make its analysis easier, proceeds as follows, on an input set S consisting of n points in a k-dimensional Euclidean space:
The algorithm will always correctly determine the closest pair, because it maps any pair closer than distance d to the same grid point or to adjacent grid points. The uniform sampling of pairs in the first step of the algorithm (compared to a different method of Rabin for sampling a similar number of pairs) simplifies the proof that the expected number of distances computed by the algorithm is linear.[9]
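A minimal Python sketch of an algorithm in this style, shown in two dimensions; the specific choices here (sampling n pairs, a grid of cell side d, comparing the 3×3 neighborhood of cells) are assumptions consistent with the analysis above, not Rabin's exact formulation:

```python
import math
import random
from collections import defaultdict

def closest_pair_distance_rabin(points):
    """Rabin-style randomized closest-pair distance (2-D sketch)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    # Step 1 (assumed): sample n pairs uniformly at random; let d be
    # the smallest sampled distance.
    d = min(math.dist(*random.sample(points, 2)) for _ in range(n))
    if d == 0:
        return 0.0  # duplicate points: the closest distance is zero
    # Step 2 (assumed): round each point to a grid of cell side d.
    grid = defaultdict(list)
    for p in points:
        grid[(math.floor(p[0] / d), math.floor(p[1] / d))].append(p)
    # Step 3: any pair closer than d falls in the same cell or in
    # adjacent cells, so comparing only those pairs is exact.
    best = d
    for (cx, cy), cell in grid.items():
        # pairs within the same cell
        for i in range(len(cell)):
            for j in range(i + 1, len(cell)):
                best = min(best, math.dist(cell[i], cell[j]))
        # "forward" neighbor cells only, so each cross-cell pair of
        # adjacent cells is visited exactly once
        for dx, dy in ((1, 0), (0, 1), (1, 1), (1, -1)):
            for p in cell:
                for q in grid.get((cx + dx, cy + dy), ()):
                    best = min(best, math.dist(p, q))
    return best
```

The result is always exact regardless of the random sample, because the sampled d only controls the grid granularity; randomness affects the running time, not correctness.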
Instead, a different algorithm of Khuller & Matias (1995) goes through two phases: a random iterated filtering process that approximates the closest distance to within an approximation ratio of 2√k, together with a finishing step that turns this approximate distance into the exact closest distance. The filtering process repeats the following steps, until S becomes empty:
The approximate distance found by this filtering process is the final value of d, computed in the step before S becomes empty. Each step removes all points whose closest neighbor is at distance d or greater, which in expectation is at least half of the points, from which it follows that the total expected time for filtering is linear. Once an approximate value of d is known, it can be used for the final steps of Rabin's algorithm; in these steps each grid point has a constant number of input points rounded to it, so again the time is linear.[10]
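The filtering phase can be sketched as follows in Python (two-dimensional case). The step structure shown is a reconstruction from the description above, not the authors' exact procedure, and the grid-based constant-time neighborhood test of the actual algorithm is replaced by a naive quadratic scan for clarity:

```python
import math
import random

def approximate_closest_distance(points):
    """Khuller–Matias-style filtering sketch (2-D).

    Assumed steps per round: pick a random point p, let d be the
    distance from p to its nearest neighbor, then discard every
    point whose own nearest neighbor is at distance d or greater.
    Returns the last computed d when S runs out of points.
    """
    S = list(points)
    d = math.inf
    while len(S) > 1:
        p = random.choice(S)
        # distance from p to its nearest neighbor in S
        d = min(math.dist(p, q) for q in S if q is not p)
        if d == 0:
            return 0.0  # duplicate points: closest distance is zero
        # remove points with no neighbor strictly closer than d
        # (naive O(n^2) test; the real algorithm uses a grid to do
        # this in linear time)
        S = [q for q in S
             if any(r != q and math.dist(q, r) < d for r in S)]
    return d
```

Note that the point p itself is always removed (its nearest neighbor is at distance exactly d), so the loop terminates, and every d computed is a nearest-neighbor distance within a subset of the input, so the returned value is always at least the true closest distance.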
The dynamic version of the closest-pair problem is stated as follows: given a set of points that changes over time through insertions and deletions, maintain a data structure from which the current closest pair can be reported efficiently after each update.
If the bounding box for all points is known in advance and the constant-time floor function is available, then a data structure using expected O(n) space has been suggested that supports expected O(log n)-time insertions and deletions and constant-time queries. When modified for the algebraic decision tree model, insertions and deletions would require expected O(log² n) time.[11] The complexity of this dynamic closest-pair data structure is exponential in the dimension k, and therefore such an algorithm becomes less suitable for high-dimensional problems.
An algorithm for the dynamic closest-pair problem in k-dimensional space was developed by Sergey Bespamyatnikh in 1998.[12] Points can be inserted and deleted in O(log n) worst-case time per point.
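To make the dynamic problem statement concrete, here is a naive Python baseline for the interface (class and method names are illustrative); it does not achieve the polylogarithmic update bounds cited above:

```python
import math

class NaiveDynamicClosestPair:
    """Naive baseline for the dynamic closest-pair interface.

    Insertion compares the new point against all stored points
    (O(n)); deletion recomputes the closest pair from scratch
    (O(n^2)); queries are O(1).  Shown only to illustrate the
    operations a real dynamic structure must support.
    """

    def __init__(self):
        self.points = []
        self.best = (math.inf, None, None)

    def insert(self, p):
        # compare the new point against every existing point: O(n)
        for q in self.points:
            d = math.dist(p, q)
            if d < self.best[0]:
                self.best = (d, p, q)
        self.points.append(p)

    def delete(self, p):
        self.points.remove(p)
        # recompute the closest pair from scratch: O(n^2)
        self.best = (math.inf, None, None)
        pts = self.points
        for i in range(len(pts)):
            for j in range(i + 1, len(pts)):
                d = math.dist(pts[i], pts[j])
                if d < self.best[0]:
                    self.best = (d, pts[i], pts[j])

    def closest_pair(self):
        # constant-time query of the maintained answer
        return self.best
```

The data structures discussed above replace these linear and quadratic update costs with (poly)logarithmic ones while keeping the constant-time query.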
Shamos, Michael Ian; Hoey, Dan (1975). "Closest-point problems". 16th Annual Symposium on Foundations of Computer Science, Berkeley, California, USA, October 13–15, 1975. IEEE Computer Society. pp. 151–162. doi:10.1109/SFCS.1975.8.
Rabin, M. (1976). "Probabilistic algorithms". Algorithms and Complexity: Recent Results and New Directions. Academic Press. pp. 21–39. As cited by Khuller & Matias (1995).
Khuller, Samir; Matias, Yossi (1995). "A simple randomized sieve algorithm for the closest-pair problem". Information and Computation. 118 (1): 34–37. doi:10.1006/inco.1995.1049. MR 1329236. S2CID 206566076.
Lipton, Richard (24 September 2011). "Rabin Flips a Coin". Gödel's Lost Letter and P=NP.
Fortune, Steve; Hopcroft, John (1979). "A note on Rabin's nearest-neighbor algorithm". Information Processing Letters. 8 (1): 20–23. doi:10.1016/0020-0190(79)90085-1. hdl:1813/7460. MR 0515507.
Clarkson, Kenneth L. (1983). "Fast algorithms for the all nearest neighbors problem". 24th Annual Symposium on Foundations of Computer Science, Tucson, Arizona, USA, 7–9 November 1983. IEEE Computer Society. pp. 226–232. doi:10.1109/SFCS.1983.16. ISBN 0-8186-0508-1.
Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001) [1990]. "33.4: Finding the closest pair of points". Introduction to Algorithms (2nd ed.). MIT Press and McGraw-Hill. pp. 957–961. ISBN 0-262-03293-7.
Kleinberg, Jon M.; Tardos, Éva (2006). "5.4 Finding the closest pair of points". Algorithm Design. Addison-Wesley. pp. 225–231. ISBN 978-0-321-37291-8.
Golin, Mordecai; Raman, Rajeev; Schwarz, Christian; Smid, Michiel (1998). "Randomized data structures for the dynamic closest-pair problem" (PDF). SIAM Journal on Computing. 27 (4): 1036–1072. doi:10.1137/S0097539794277718. MR 1622005. S2CID 1242364.
Bespamyatnikh, S. N. (1998). "An optimal algorithm for closest-pair maintenance". Discrete & Computational Geometry. 19 (2): 175–195. doi:10.1007/PL00009340. MR 1600047.