On The Geometry Of Similarity Search Dimensionality Curse And Concentration Of Measure | Awesome Learning to Hash Add your paper to Learning2Hash

On The Geometry Of Similarity Search Dimensionality Curse And Concentration Of Measure

Pestov Vladimir. Information Processing Letters 1999

[Paper]    
NEURIPS

We suggest that the curse of dimensionality affecting the similarity-based search in large datasets is a manifestation of the phenomenon of concentration of measure on high-dimensional structures. We prove that, under certain geometric assumptions on the query domain \(Ω\) and the dataset \(X\), if \(Ω\) satisfies the so-called concentration property, then for most query points \(x^\ast\) the ball of radius \((1+\e)d_X(x^\ast)\) centred at \(x^\ast\) contains either all points of \(X\) or else at least \(C_1\exp(-C_2\e^2n)\) of them. Here \(d_X(x^\ast)\) is the distance from \(x^\ast\) to the nearest neighbour in \(X\) and \(n\) is the dimension of \(Ω\).

Similar Work