Subgroup Discovery Algorithms: A Survey and Empirical Evaluation
-
Abstract
Subgroup discovery is a data mining technique that discovers interesting associations among different variables with respect to a property of interest. Existing subgroup discovery methods employ different strategies for searching, pruning and ranking subgroups. It is very crucial to learn which features of a subgroup discovery algorithm should be considered for generating quality subgroups. In this regard, a number of reviews have been conducted on subgroup discovery. Although they provide a broad overview on some popular subgroup discovery methods, they employ few datasets and measures for subgroup evaluation. In the light of the existing measures, the subgroups cannot be appraised from all perspectives. Our work performs an extensive analysis on some popular subgroup discovery methods by using a wide range of datasets and by defining new measures for subgroup evaluation. The analysis result will help with understanding the major subgroup discovery methods, uncovering the gaps for further improvement and selecting the suitable category of algorithms for specific application domains.
-
-