An Entity Linking Method That Enriches Context Information via Search Engines

Enriching Context Information for Entity Linking with Web Data

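  • Abstract (Chinese version, translated): Entity linking is the task of linking entity mentions in text to a predefined knowledge base and resolving their ambiguity. Existing methods mainly use either "local" information (the textual context of the mention) or "global" information (relations among candidate entities). However, these methods all suffer to some degree from insufficient text, especially on short-text data. To obtain richer local and global information, we propose retrieving additional relevant text from the web through a web search engine (WSE) to supplement the mention's context. On this basis, we propose two ways to improve entity linking. The first adds web search results to an embedding-based entity linking method, expanding the mention's context and thereby improving linking quality. The second uses web text to extend the global information: suitable named entities are selected from the additional text and added to the construction of the graph model, improving graph-based entity linking. We process the search engine result text in two ways: one applies an attention mechanism and the other uses an abstract extraction method. Finally, the two proposed models can be combined to improve entity linking from both the local and the global side. Experiments on six real-world datasets show that extending local and global information with additional search engine text effectively improves the F1 score of entity linking.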

     

    Abstract: Entity linking (EL) is the task of determining the identity of textual entity mentions with respect to a predefined knowledge base (KB). Many existing approaches address this task using either "local" information (the contextual information of the mention in the text) or "global" information (relations among candidate entities). However, both local and global information may be insufficient, especially when the given text is short. To obtain richer local and global information for entity linking, we propose enriching the context of mentions with extra contexts retrieved from the web through web search engines (WSE). Based on this intuition, we make two novel attempts. The first adds web search results to an embedding-based method to expand the mention's local information; here we try two methods for generating high-quality web contexts: one applies the attention mechanism and the other uses the abstract extraction method. The second uses the web contexts to extend the global information, i.e., it finds and utilizes additional relevant mentions from the web contexts with a graph-based model. Finally, we combine the two proposed models to use both the extended local and global information from the extra web contexts. Our empirical study on six real-world datasets shows that using extra web contexts to extend the local and global information can effectively improve the F1 score of entity linking.
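
To make the first idea in the abstract concrete, the following is a minimal sketch of attention-weighted context enrichment for an embedding-based linker. It is not the authors' model. The two-dimensional word vectors, the snippet texts, the candidate names, the helper functions, and the choice to average the original context with the attended web context are all assumptions made for illustration, and no real search engine is called.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0

def embed(text, vectors, dim=2):
    """Average the toy word vectors of the known words in `text`."""
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return [0.0] * dim
    return [sum(col) / len(known) for col in zip(*known)]

def attention_weights(query, keys):
    """Softmax over dot products between the query and each key."""
    scores = [dot(query, k) for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def enrich_context(context_vec, snippet_vecs):
    """Blend the mention context with an attention-weighted sum of snippets.

    Averaging the two vectors is an illustrative assumption.
    """
    weights = attention_weights(context_vec, snippet_vecs)
    web_vec = [
        sum(w * v[i] for w, v in zip(weights, snippet_vecs))
        for i in range(len(context_vec))
    ]
    return [(c + w) / 2 for c, w in zip(context_vec, web_vec)]

def link(mention_vec, candidates):
    """Return the candidate whose vector is most similar to the mention."""
    return max(candidates, key=lambda name: cosine(mention_vec, candidates[name]))

# Hypothetical toy embeddings: axis 0 ~ "food", axis 1 ~ "technology".
VECTORS = {
    "phone": [0.1, 0.9], "iphone": [0.0, 1.0], "company": [0.0, 1.0],
    "fruit": [1.0, 0.0], "pie": [1.0, 0.0],
}
CANDIDATES = {"Apple Inc.": [0.0, 1.0], "Apple (fruit)": [1.0, 0.0]}

# A short text with little context, plus stand-ins for search results.
context_vec = embed("new phone from apple", VECTORS)
snippet_vecs = [embed(s, VECTORS) for s in ["iphone company", "fruit pie"]]

weights = attention_weights(context_vec, snippet_vecs)
enriched = enrich_context(context_vec, snippet_vecs)
print(link(enriched, CANDIDATES))
```

In this toy setup the snippet that agrees with the mention's own context receives the larger attention weight, so noisy search results contribute less to the enriched vector than relevant ones.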

     
