ConfDTree: Statistical Methods for Improving Decision Trees

Gilad Katz; Asaf Shabtai; Lior Rokach; Nir Ofek

doi:10.1007/s11390-014-1438-5

Citation: Gilad Katz, Asaf Shabtai, Lior Rokach, Nir Ofek. ConfDTree: Statistical Methods for Improving Decision Trees[J]. Journal of Computer Science and Technology, 2014, 29(3): 392-407. DOI: 10.1007/s11390-014-1438-5

Abstract

Decision trees have three main disadvantages: reduced performance when the training set is small, rigid decision criteria, and the fact that a single "uncharacteristic" attribute might "derail" the classification process. In this paper we present ConfDTree (Confidence-Based Decision Tree), a post-processing method that enables decision trees to better classify outlier instances. This method, which can be applied to any decision tree algorithm, uses easy-to-implement statistical methods (confidence intervals and two-proportion tests) to identify hard-to-classify instances and to propose alternative routes. The experimental study indicates that the proposed post-processing method consistently and significantly improves the predictive performance of decision trees, particularly for small, imbalanced, or multi-class datasets, for which an average improvement of 5% to 9% in AUC is reported.
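
For readers unfamiliar with the two statistical tools named in the abstract, the following is a minimal Python sketch of their textbook forms: a normal-approximation confidence interval for a mean and a pooled two-proportion z-test. It is not the ConfDTree procedure itself; the abstract does not describe how these tests are wired into the tree, and the function names (`mean_confidence_interval`, `two_proportion_z_test`) and the fixed 1.96 critical value are assumptions made purely for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mean_confidence_interval(values, z_crit=1.96):
    """Approximate 95% confidence interval for the mean of `values`.

    Uses the normal approximation with the sample standard deviation;
    requires at least two values.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    half_width = z_crit * math.sqrt(variance / n)
    return mean - half_width, mean + half_width


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for the equality of two proportions.

    x1/n1 and x2/n2 are the observed success counts and sample sizes.
    Returns (z statistic, p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        # Both samples are all successes or all failures: no evidence of a difference.
        return 0.0, 1.0
    z = (p1 - p2) / se
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p_value
```

One hypothetical use in a tree setting would be to flag an instance whose attribute value falls outside the interval computed from the training values seen at a node, and to compare class proportions between two candidate branches with the z-test. That mapping is an assumption here, not a description of the paper's algorithm.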
