Multi-Instance Learning from Supervised View

Zhi-Hua Zhou

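Abstract (translated from the Chinese): In multi-instance learning, the training examples are bags composed of multiple instances. The bags carry concept labels, but the instances themselves do not. A bag is positive if it contains at least one positive instance, and negative otherwise. The goal of learning is to predict the labels of unseen bags. Because of its distinctive character, multi-instance learning has been regarded as a new learning framework alongside supervised learning, unsupervised learning, and reinforcement learning. When proposing the notion of multi-instance learning, T.G. Dietterich et al. pointed out that a particularly worthwhile research topic in this area is how to adapt commonly used machine learning methods so that they can handle multi-instance tasks. This question has driven the development of the field. Through the work of many researchers, most commonly used machine learning methods now have multi-instance counterparts; unfortunately, there has been no general method or rule for converting different learning methods into multi-instance ones. This paper analyzes several representative multi-instance learning algorithms and shows that supervised learning methods can be converted into multi-instance learning methods simply by shifting the focus of the method from discriminating among instances to discriminating among bags. This general rule reveals the connection between multi-instance learning and supervised learning, and offers a general route for designing multi-instance learning algorithms. Ensemble learning uses multiple individual learners to solve the same problem, and previous studies have shown that ensemble techniques can significantly improve the generalization ability of supervised learners. Since this paper has shown that multi-instance learning is closely related to supervised learning, it proposes solving multi-instance problems by building multi-instance ensembles. On a real-world benchmark, multi-instance ensembles performed very well, significantly improving the generalization ability of several multi-instance learners and achieving the highest accuracy reported to date.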

Abstract: In multi-instance learning, the training set comprises labeled bags that are composed of unlabeled instances, and the task is to predict the labels of unseen bags. This paper studies multi-instance learning from the view of supervised learning. First, by analyzing some representative learning algorithms, this paper shows that multi-instance learners can be derived from supervised learners by shifting their focuses from the discrimination on the instances to the discrimination on the bags. Second, considering that ensemble learning paradigms can effectively enhance supervised learners, this paper proposes to build multi-instance ensembles to solve multi-instance problems. Experiments on a real-world benchmark test show that ensemble learning paradigms can significantly enhance multi-instance learners.
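
As an illustration of the setting described in the abstract, the following is a minimal Python sketch, not the paper's method. `bag_label` encodes the stated rule that a bag is positive exactly when it contains at least one positive instance. The rest is assumed for illustration only: `MaxValueThresholdLearner` is a hypothetical toy learner that discriminates on bags (it scores a bag by its largest instance value), and the ensemble uses bootstrap resampling of bags with majority voting, which is one common ensemble scheme and is not confirmed by the abstract to be the one used in the paper.

```python
import random
from collections import Counter


def bag_label(instance_labels):
    """A bag is positive iff at least one of its instances is positive."""
    return any(instance_labels)


class MaxValueThresholdLearner:
    """Hypothetical toy bag-level learner (an assumption, not from the paper).

    Each instance is a single number. A bag is scored by its largest
    instance value and predicted positive when that score reaches a
    threshold chosen to maximize accuracy on the training bags.
    """

    def fit(self, bags, labels):
        scores = [max(bag) for bag in bags]
        # Candidate thresholds: every observed bag score, plus infinity
        # so that "always negative" is also available.
        candidates = sorted(set(scores)) + [float("inf")]
        best_threshold, best_correct = None, -1
        for t in candidates:
            correct = sum((s >= t) == y for s, y in zip(scores, labels))
            if correct > best_correct:
                best_threshold, best_correct = t, correct
        self.threshold = best_threshold
        return self

    def predict(self, bag):
        return max(bag) >= self.threshold


def fit_ensemble(bags, labels, n_learners=11, seed=0):
    """Train each learner on a bootstrap sample drawn over whole bags."""
    rng = random.Random(seed)
    n = len(bags)
    learners = []
    for _ in range(n_learners):
        idx = [rng.randrange(n) for _ in range(n)]
        sample_bags = [bags[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        learners.append(MaxValueThresholdLearner().fit(sample_bags, sample_labels))
    return learners


def predict_ensemble(learners, bag):
    """Majority vote over the bag-level predictions of the learners."""
    votes = Counter(learner.predict(bag) for learner in learners)
    return votes.most_common(1)[0][0]


# Toy data (made up): positive bags contain one large-valued instance.
train_bags = [[0.1, 0.2], [0.3, 5.0], [0.2, 0.4, 0.1],
              [6.0, 0.1], [0.5], [7.0, 0.2, 0.3]]
train_labels = [False, True, False, True, False, True]

ensemble = fit_ensemble(train_bags, train_labels)
print(predict_ensemble(ensemble, [0.2, 9.0]))
print(predict_ensemble(ensemble, [0.1, 0.3]))
```

Resampling is done over bags rather than over individual instances, which mirrors the abstract's point that the learner's focus shifts from discriminating instances to discriminating bags.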
