A Relation-Enhanced Neural Model for Entity Classification under a Fine-Grained Taxonomy

doi:10.1007/s11390-017-1762-7

Relation Enhanced Neural Model for Type Classification of Entity Mentions with a Fine-Grained Taxonomy

Abstract

Inferring the semantic types of the entity mentions in a sentence is a necessary yet challenging task. Most existing methods employ a very coarse-grained type taxonomy whose types are too general and not precise enough for many other tasks. However, the performance of these methods drops sharply when the taxonomy is extended to a fine-grained one with several hundred types. In this paper, we introduce a hybrid neural network model for type classification of entity mentions with a fine-grained taxonomy. The model has four components: the entity mention component, the context component, the relation component, and the already known type component. They extract features from the target entity mention, its context, the relations between the target entity mention and the entity mentions in the surrounding context, and the already known types of those context entity mentions, respectively. The features learned by the four components are concatenated and fed into a softmax layer to predict the type distribution. We carried out extensive experiments to evaluate the proposed model. Experimental results demonstrate that our model achieves state-of-the-art performance on the FIGER dataset. Moreover, we built larger datasets from Wikipedia and DBpedia. On these larger datasets, our model achieves performance comparable to, or even better than, the state-of-the-art methods with the coarse-grained type taxonomy, and performs much better than those methods with the fine-grained type taxonomy in terms of micro-F1, macro-F1, and weighted-F1.
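The final step described in the abstract, concatenating the features from the four components and passing them through a softmax layer, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the feature dimensions, the random weights, and the use of plain lists in place of the learned neural encoders. It is not the authors' implementation, which the abstract does not specify.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_type_distribution(mention, context, relation, known_types, weights, bias):
    """Concatenate the four feature vectors and apply a linear layer plus softmax.

    The four inputs stand in for the outputs of the entity mention, context,
    relation, and already known type components (hypothetical placeholders).
    """
    features = mention + context + relation + known_types  # list concatenation
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(scores)


# Toy setup: dimensions and values are arbitrary, chosen only for the demo.
rng = random.Random(0)
dims = {"mention": 3, "context": 4, "relation": 2, "known_types": 2}
feats = {name: [rng.gauss(0, 1) for _ in range(d)] for name, d in dims.items()}

num_types = 5  # a real fine-grained taxonomy would have several hundred types
total_dim = sum(dims.values())
weights = [[rng.gauss(0, 1) for _ in range(total_dim)] for _ in range(num_types)]
bias = [0.0] * num_types

probs = predict_type_distribution(
    feats["mention"], feats["context"], feats["relation"], feats["known_types"],
    weights, bias,
)
predicted_type = max(range(num_types), key=lambda i: probs[i])
```

Here `probs` holds one probability per type and sums to 1, and `predicted_type` is the index of the most likely type. In a trained model the weights would be learned jointly with the four feature extractors rather than sampled at random.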
