Efficiency analysis of NNS methods

Authors

  • Balázs Bolyki
  • Dávid Polonkai
  • László Kovács

Keywords:

NNS, VP-tree, LSH, Prefix tree, Nearest Neighbor, search, comparison

Abstract

Nearest Neighbor Search is a key operation in multiple information technologies fields, for example, string matching, plagiarism detection, natural language processing, image clustering, etc. It is crucial, that we have cost efficient methods and structures for retrieving data based on similarity. We conducted a survey of two popular - Vantage Point tree and Locality Sensitive hashing -, and one more recent - Prefix tree with clustering - NNS methods. In order to perform a wide range of tests on these algorithms, we adapted each to Python language, and developed multiple tests. In this paper we present the description of the three algorithms and the results of our tests. We aim to provide an informative comparison of the three major Nearest Neighbor Search structures.

Downloads

Published

2020-12-30