Combining KNN with AutoEncoder for Outlier Detection
K-nearest neighbor (KNN) is one of the most fundamental methods for unsupervised outlier detection, thanks to advantages such as ease of use and relatively high accuracy. However, most data-analytic tasks today involve high-dimensional data, on which KNN-based methods often fail due to "the curse of dimensionality". AutoEncoder-based methods have recently been introduced that use reconstruction errors for outlier detection on high-dimensional data, but a directly trained AutoEncoder typically does not preserve the data proximity relationships needed for outlier detection. In this study, we propose to combine KNN with AutoEncoder for outlier detection. First, we propose the nearest neighbor AutoEncoder (NNAE), which preserves the original data proximity in a much lower-dimensional space that is more suitable for performing KNN. Second, we propose the k-nearest reconstruction neighbors (KNRN) method, which incorporates the reconstruction errors of NNAE into the k-distances of KNN to detect outliers. Third, we develop a method to automatically choose better parameters for optimizing the structure of NNAE. Finally, using five real-life datasets, we experimentally show that our proposed approach, NNAE+KNRN, substantially outperforms existing methods, i.e., KNN, Isolation Forest, AutoEncoder-RE, and Robust AutoEncoder.
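To make the general idea concrete, the following is a minimal illustrative sketch, not the paper's NNAE+KNRN algorithm: it combines a KNN k-distance score with a reconstruction-error score on synthetic data. For self-containedness, a PCA projection stands in for the AutoEncoder (its residual plays the role of the reconstruction error), and the two signals are merged by a simple normalized sum; the paper's actual combination rule is more elaborate. All variable names and the scoring formula here are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: a dense Gaussian cluster plus 5 injected outliers.
X = np.vstack([rng.normal(0.0, 1.0, (200, 10)),
               rng.normal(6.0, 1.0, (5, 10))])

# Stand-in for AutoEncoder reconstruction error: project onto the top
# principal components and measure the reconstruction residual.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
n_comp = 3
recon = Xc @ Vt[:n_comp].T @ Vt[:n_comp]
recon_err = np.linalg.norm(Xc - recon, axis=1)

# k-distance: distance from each point to its k-th nearest neighbor.
k = 5
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
k_dist = np.sort(D, axis=1)[:, k]  # column 0 is the self-distance 0

# Combine the two signals with a min-max-normalized sum (an assumed
# scoring rule, chosen only for simplicity of illustration).
def minmax(v):
    return (v - v.min()) / (v.max() - v.min() + 1e-12)

score = minmax(recon_err) + minmax(k_dist)
# Higher scores indicate points that are far from their neighbors
# and/or poorly reconstructed, i.e., likely outliers.
```

The injected outliers (indices 200-204) receive markedly higher scores than typical inliers, because both their k-distances and their residuals are inflated relative to points inside the cluster.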