New Instability Results For High Dimensional Nearest Neighbor Search

Chris Giannella. Information Processing Letters 109(19), 2009. 0 citations.

Datasets

Consider a dataset of n(d) points generated independently from R^d according to a common p.d.f. f_d with support(f_d) = [0,1]^d and sup{f_d([0,1]^d)} growing sub-exponentially in d. We prove that: (i) if n(d) grows sub-exponentially in d, then, for any query point q^d in [0,1]^d and any epsilon > 0, the ratio of the distances from any two dataset points to q^d is less than 1+epsilon with probability → 1 as d → infinity; (ii) if n(d) > [4(1+epsilon)]^d for large d, then, for all q^d in [0,1]^d (except a small subset) and any epsilon > 0, the distance ratio is less than 1+epsilon with limiting probability strictly bounded away from one. Moreover, we provide preliminary results along the lines of (i) when f_d = N(mu_d, Sigma_d).
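The behaviour described in (i) can be observed numerically. The following is a minimal sketch and not code from the paper: it assumes f_d is the uniform density on [0,1]^d (so sup f_d = 1), a fixed dataset size n, and Euclidean distance. The function name `distance_ratio` and all parameter values are hypothetical, chosen only for illustration. It reports the ratio of the farthest to the nearest dataset point's distance from a random query, which upper-bounds the distance ratio for any pair of dataset points.

```python
import numpy as np


def distance_ratio(n, d, seed=0):
    """Max/min Euclidean distance from one random query to n points in [0,1]^d.

    Assumption (not from the paper): points and query are drawn uniformly,
    i.e. f_d is the uniform density on the unit cube.
    """
    rng = np.random.default_rng(seed)
    points = rng.random((n, d))   # n dataset points, uniform on [0,1]^d
    query = rng.random(d)         # a single query point q^d
    dists = np.linalg.norm(points - query, axis=1)
    return dists.max() / dists.min()


if __name__ == "__main__":
    # With n fixed (hence sub-exponential in d), the ratio should drift
    # toward 1 as the dimension d increases.
    for d in (2, 10, 100, 1000, 2000):
        print(f"d={d:5d}  max/min distance ratio = {distance_ratio(500, d):.3f}")
```

A ratio close to 1 means the nearest neighbor of the query is barely closer than the farthest point, which is the sense in which nearest neighbor search becomes unstable. Regime (ii), where n(d) grows exponentially in d, is not practical to simulate this way beyond very small d.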

Similar Work