Composite Distance Transformation for Indexing and k-Nearest-Neighbor Searching in High-Dimensional Spaces
-
Abstract
Due to the famous dimensionality curse problem, searchin a high-dimensional space is considered as a ``hard'' problem. In thispaper, a novel \it composite distance transformation method, which iscalled CDT, is proposed to support a fast k-nearest-neighbor(k-NN) search in high-dimensional spaces. In CDT, all (n) datapoints are first grouped into some clusters by a k-Means clusteringalgorithm. Then a composite distance key of each data point is computed.Finally, these index keys of such n data points are inserted by apartition-based B^+-tree. Thus, given a query point, its k-NN search inhigh-dimensional spaces is transformed into the search in the singledimensional space with the aid of CDT index. Extensive performancestudies are conducted to evaluate the effectiveness and efficiency ofthe proposed scheme. Our results show that this method outperforms thestate-of-the-art high-dimensional search techniques, such as the X-Tree,VA-file, iDistance and NB-Tree.
-
-