Qing-Ying Yu, Ge-Ge Shi, Dong-Sheng Xu, Wen-Kai Wang, Chuan-Ming Chen, Yong-Long Luo. Density Peak Clustering Algorithm Based on Data Field Theory and Grid Similarity[J]. Journal of Computer Science and Technology. DOI: 10.1007/s11390-023-2984-5

Density Peak Clustering Algorithm Based on Data Field Theory and Grid Similarity

  • The density peak clustering algorithm can rapidly identify cluster centers by drawing decision graphs without any prior knowledge; however, when multiple density peaks are present in one cluster of the dataset, the cluster centers cannot be accurately obtained, leading to incorrect clustering. Moreover, the single-step allocation strategy has poor fault tolerance and can lead to successive allocation errors. To address these problems, this study proposes a density peak clustering method based on data field theory and grid similarity, which divides the original data into grid spaces to obtain subspace codes and grid data. Subsequently, the potential grid density is introduced to measure the density of each grid based on the data field, and the density peaks are identified to obtain the cluster centers from the high-density grids. Finally, self-organized clustering is realized based on neighborhood extension centers and grid similarity. This reduces the possibility of successive assignment errors and the negative impact of multiple density peaks on the clustering results. In addition, the algorithm automatically determines the cluster centers and can correctly assign the non-density-peak grids to the corresponding clusters. Experimental results on synthetic and real-world datasets demonstrate that the proposed algorithm outperforms similar algorithms in terms of accuracy and efficiency when dealing with complex datasets with large density differences and cross-tangling.
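The first two steps of the abstract (dividing the data into grid cells, then measuring a potential-based density per cell) can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption: the function name `grid_potential_density`, the parameters `cells_per_dim` and `sigma`, the uniform grid, and the Gaussian-style potential (a common form in data field theory) are illustrative choices, not the authors' definitions.

```python
import numpy as np

def grid_potential_density(points, cells_per_dim=10, sigma=1.0):
    """Hypothetical sketch: grid the data, then score each occupied cell
    with a Gaussian-style data-field potential (assumed form)."""
    points = np.asarray(points, dtype=float)

    # Scale each dimension to [0, cells_per_dim) and take the integer part
    # as the cell's code along that dimension.
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero on constant dimensions
    codes = np.floor((points - lo) / span * cells_per_dim).astype(int)
    codes = np.clip(codes, 0, cells_per_dim - 1)  # max value lands in last cell

    # Occupied cells and the number of points in each.
    cells, counts = np.unique(codes, axis=0, return_counts=True)

    # Pairwise Euclidean distance between cell codes, in cell units.
    diff = cells[:, None, :] - cells[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Assumed potential: each cell j contributes its point count, damped by
    # exp(-(d_ij / sigma)^2); a cell also contributes to itself (d = 0).
    potential = (counts[None, :] * np.exp(-(dist / sigma) ** 2)).sum(axis=1)
    return cells, counts, potential

# Usage: three nearby points and one distant point on a 2 x 2 grid.
cells, counts, potential = grid_potential_density(
    [[0, 0], [0.1, 0.1], [0.05, 0.05], [10, 10]], cells_per_dim=2, sigma=1.0
)
```

In this sketch, cells with a high potential would be the candidates for density peaks; the center selection and the grid-similarity assignment described in the abstract are not modeled here.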
