Efficiency analysis of NNS methods
Keywords:
NNS, VP-tree, LSH, Prefix tree, Nearest Neighbor, search, comparisonAbstract
Nearest Neighbor Search is a key operation in multiple information technologies fields, for example, string matching, plagiarism detection, natural language processing, image clustering, etc. It is crucial, that we have cost efficient methods and structures for retrieving data based on similarity. We conducted a survey of two popular - Vantage Point tree and Locality Sensitive hashing -, and one more recent - Prefix tree with clustering - NNS methods. In order to perform a wide range of tests on these algorithms, we adapted each to Python language, and developed multiple tests. In this paper we present the description of the three algorithms and the results of our tests. We aim to provide an informative comparison of the three major Nearest Neighbor Search structures.