Fast Approximate Complete-data k-nearest-neighbor Estimation


  • Alejandro Murua
  • Nicolas Wicker University of Lille



We introduce a fast method to estimate the complete-data set of k-nearest-neighbors.This is equivalent to finding an estimate of the k-nearest-neighbor graph of the data. The method relies on random normal projections. The k-nearest-neighbors are estimated by sorting points in a number of random lines. For very large datasets, the method is quasi-linear in the data size. As an application, we show that the intrinsic dimension of a manifold can be reliably estimated from the estimated set of k-nearest-neighbors in time about two orders of magnitude faster than when using the exact set of k-nearest-neighbors.



How to Cite

Murua, A., & Wicker, N. (2020). Fast Approximate Complete-data k-nearest-neighbor Estimation. Austrian Journal of Statistics, 49(2), 18-30.