Incremental Multi-Label Learning with Active Queries

  • Abstract: In multi-label learning, labeling instances is expensive because each instance is associated with multiple labels simultaneously. Active learning, which reduces the labeling cost by selectively querying the labels of the most valuable data, is therefore particularly important for multi-label learning. A good multi-label active learning algorithm usually has two crucial elements: a reasonable criterion for evaluating the gain of querying the label of an instance, and an effective classification model whose predictions allow that criterion to be computed accurately. In this paper, we first introduce an effective multi-label classification model that combines label ranking with threshold learning and is trained incrementally, which avoids retraining from scratch after every query. Based on this model, we then propose to exploit both uncertainty and diversity in the instance space as well as the label space, and to actively query the instance-label pairs that improve the classification model most. Extensive experiments on 20 datasets demonstrate the superiority of the proposed approach over state-of-the-art methods.
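The idea of querying instance-label pairs can be illustrated with a small sketch. It is not the criterion proposed in the paper: it assumes a model that already outputs a relevance score per instance-label pair plus a per-instance threshold, and it uses only a simple uncertainty measure (distance of the score from the threshold), ignoring diversity. The names `select_pair`, `scores`, `thresholds`, and `queried` are hypothetical.

```python
import numpy as np

def select_pair(scores, thresholds, queried):
    """Pick the unqueried (instance, label) pair whose score lies closest
    to its instance's threshold, i.e. the most uncertain pair.

    scores:     (n_instances, n_labels) relevance scores (assumed given)
    thresholds: (n_instances,) per-instance relevance thresholds (assumed given)
    queried:    (n_instances, n_labels) boolean mask of already-queried pairs
    """
    if queried.all():
        raise ValueError("every instance-label pair has already been queried")
    # Distance to the threshold: a small margin means an uncertain prediction.
    margin = np.abs(scores - thresholds[:, None])
    # Exclude pairs whose labels are already known.
    margin = np.where(queried, np.inf, margin)
    i, j = np.unravel_index(np.argmin(margin), margin.shape)
    return int(i), int(j)

# Toy example with 2 instances and 3 labels.
scores = np.array([[0.9, 0.1, 0.45],
                   [0.2, 0.8, 0.70]])
thresholds = np.array([0.5, 0.5])
queried = np.zeros_like(scores, dtype=bool)

pair = select_pair(scores, thresholds, queried)
queried[pair] = True  # the oracle's answer would be recorded here
next_pair = select_pair(scores, thresholds, queried)
```

In a full active learning loop, the model would be updated incrementally with each answered query before the next pair is selected, rather than reusing fixed scores as this toy example does.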

     
