Fast Approximate Complete-data k-nearest-neighbor Estimation

Authors

  • Alejandro Murua
  • Nicolas Wicker University of Lille

DOI:

https://doi.org/10.17713/ajs.v49i2.907

Abstract

We introduce a fast method to estimate the complete-data set of k-nearest-neighbors.This is equivalent to finding an estimate of the k-nearest-neighbor graph of the data. The method relies on random normal projections. The k-nearest-neighbors are estimated by sorting points in a number of random lines. For very large datasets, the method is quasi-linear in the data size. As an application, we show that the intrinsic dimension of a manifold can be reliably estimated from the estimated set of k-nearest-neighbors in time about two orders of magnitude faster than when using the exact set of k-nearest-neighbors.

Published

2020-02-20

How to Cite

Murua, A., & Wicker, N. (2020). Fast Approximate Complete-data k-nearest-neighbor Estimation. Austrian Journal of Statistics, 49(2), 18-30. https://doi.org/10.17713/ajs.v49i2.907