Subgroup Discovery Algorithms: A Survey and Empirical Evaluation

    Abstract: Subgroup discovery is a data mining technique that discovers interesting associations among different variables with respect to a property of interest. Existing subgroup discovery methods employ different strategies for searching, pruning, and ranking subgroups. It is crucial to learn which features of a subgroup discovery algorithm should be considered in order to generate quality subgroups. A number of reviews of subgroup discovery have been conducted with this aim. Although they provide a broad overview of some popular subgroup discovery methods, they employ few datasets and few measures for subgroup evaluation. Moreover, the existing measures do not allow subgroups to be appraised from all perspectives. Our work performs an extensive analysis of some popular subgroup discovery methods by using a wide range of datasets and by defining new measures for subgroup evaluation. The results of this analysis will help in understanding the major subgroup discovery methods, uncovering gaps for further improvement, and selecting a suitable category of algorithms for specific application domains.

     
