A Multi-View Clustering Algorithm with Discriminative Binary Codes

Discriminative Binary Multi-View Clustering

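  • Abstract:
    Background: For large-scale data, existing multi-view clustering methods based on binary codes overlook the redundant features in the original data and fail to preserve the geometric consistency of the data in the learned binary code representation space. Moreover, as both the number and the dimensionality of data keep growing, multi-view clustering algorithms incur high time and space complexity when obtaining data labels.
    Objective: To improve the quality of the binary codes obtained by mapping the original features from the real-valued space to the Hamming space, and to address the high time complexity and low clustering performance on large-scale data, thereby further improving clustering performance.
    Methods: To enable the model to select useful features from the original data and obtain a discriminative representation, this paper uses $L_{2,1}$-norm regularization to constrain the feature projection matrix, which screens out discriminative features and effectively removes redundant and noisy information from the original features. A graph regularization term is also introduced to preserve the geometric consistency of the learned binary code representation. The two strategies reinforce each other, so that the consensus binary code representation used for clustering in the Hamming space is both high-quality and discriminative. Preserving the local structure of the data during binary embedding retains the correlations among the original features. Compressing high-dimensional data into short binary codes reduces the heavy computational burden of processing large-scale multi-view datasets.
    Results and conclusions: Extensive experiments on six large-scale multi-view datasets show that the proposed method outperforms the baseline methods in both effectiveness and efficiency, making it more attractive for large-scale clustering problems.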
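    The two regularizers named in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's objective function: the function names, the toy matrices, and the use of the unnormalized Laplacian are assumptions made only for illustration. It shows how an L_{2,1} norm sums the Euclidean norms of the rows of a projection matrix (so whole rows, i.e. features, are pushed toward zero) and how a graph term tr(B^T L B) penalizes neighboring samples that receive different codes.

```python
import numpy as np

def l21_norm(W):
    """L_{2,1} norm: sum of the Euclidean norms of the rows of W."""
    return float(np.sqrt((W ** 2).sum(axis=1)).sum())

def graph_laplacian(S):
    """Unnormalized Laplacian L = D - S of a symmetric affinity matrix S."""
    return np.diag(S.sum(axis=1)) - S

def graph_regularizer(B, S):
    """tr(B^T L B), which equals 0.5 * sum_ij S_ij * ||b_i - b_j||^2."""
    return float(np.trace(B.T @ graph_laplacian(S) @ B))

# Toy projection matrix (hypothetical): the all-zero row corresponds to a
# feature that would be discarded entirely.
W = np.array([[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]])

# Toy codes in {-1, +1}: samples 0 and 1 share a code, sample 2 differs.
B = np.array([[1, -1], [1, -1], [-1, 1]], dtype=float)

# Affinity linking samples with identical codes (no penalty expected).
S_same = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)
# Affinity linking samples with different codes (penalty expected).
S_diff = np.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]], dtype=float)

print(l21_norm(W))                   # 6.0
print(graph_regularizer(B, S_same))  # 0.0
print(graph_regularizer(B, S_diff))  # 8.0
```

    In the toy example the graph term is zero when linked samples share a code and positive when they do not, which is the sense in which such a term preserves local structure.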

     

    Abstract: Binary multi-view clustering has attracted intense attention from researchers due to its efficiency in handling large-scale datasets. However, previous clustering approaches suffer from at least two limitations. First, they ignore correlations among the features of the original data. As a result, the geometric consistency of the data is not preserved in the learned binary representation space. Second, redundant and noisy features mixed into the original data inevitably limit the final clustering performance. In light of this, we propose a novel discriminative binary multi-view clustering (DBMVC) method to address these issues. Specifically, the proposed DBMVC first maps the original data onto the Hamming space to obtain the corresponding binary codes, which effectively reduces the computational complexity and storage costs of the subsequent steps. To enable our method to select useful features from the original data and obtain a discriminative representation, the $L_{2,1}$-norm is used to constrain the feature projection matrix. In addition, a graph regularization term is introduced to preserve the local manifold structure of the learned binary representation. Finally, an alternating iterative optimization algorithm is designed to solve the resulting optimization problem. Comprehensive experiments on six large-scale multi-view datasets validate that the proposed DBMVC markedly outperforms other state-of-the-art methods in terms of effectiveness and efficiency.
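    The storage and computation argument for mapping data to the Hamming space can be sketched as follows. This is only an illustration under stated assumptions: the projection `P` is a random matrix rather than the learned projection of DBMVC, and `binary_codes` and `hamming_distances` are hypothetical helper names. The sketch thresholds a linear projection to obtain bits, packs them into bytes, and compares samples by counting differing bits.

```python
import numpy as np

def binary_codes(X, P):
    """Map real-valued rows of X to {0, 1} codes via the sign of X @ P."""
    return (X @ P > 0).astype(np.uint8)

def hamming_distances(codes, query):
    """Number of differing bits between each row of codes and query."""
    return np.count_nonzero(codes != query, axis=1)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 16))   # 5 samples, 16 real-valued features
P = rng.standard_normal((16, 8))   # assumed random projection to 8 bits

codes = binary_codes(X, P)           # shape (5, 8), entries in {0, 1}
packed = np.packbits(codes, axis=1)  # 8 bits stored in 1 byte per sample

# Storage: 16 float64 values per sample versus a single byte per sample.
print(X.nbytes, packed.nbytes)

# Distance from every sample to the first sample; the first entry is 0.
print(hamming_distances(codes, codes[0]))
```

    Short codes of this kind are what make clustering in the Hamming space cheap: distances reduce to bit comparisons, and each sample occupies a few bytes instead of a full real-valued vector.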

     
