Journal of Computer Science and Technology

   

Label-Aware Chinese Event Detection with Heterogeneous Graph Attention Network

Shi-Yao Cui1,2(崔诗尧), Bo-Wen Yu1,2(郁博文), Xin Cong1,2(从鑫), Ting-Wen Liu1,2,∗(柳厅文), Member, CCF, Qing-Feng Tan3(谭庆丰), Member, CCF, IEEE, and Jin-Qiao Shi4(时金桥)   

  1 Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100190, China
  2 School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China
  3 Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou 510006, China
  4 School of Cyber Security, Beijing University of Posts and Telecommunications, Beijing 100088, China
  • Received: 2021-04-23 Revised: 2022-11-09 Accepted: 2023-01-02
  • Contact: Ting-Wen Liu, E-mail: liutingwen@iie.ac.cn
  • About author: Ting-Wen Liu is a professor at the Institute of Information Engineering, Chinese Academy of Sciences, and at the University of Chinese Academy of Sciences. He is also a member of the Youth Innovation Promotion Association CAS and of CCF. He received his Ph.D. degree from the Institute of Computing Technology, Chinese Academy of Sciences, in 2013. His research interests include information extraction, text matching, and knowledge graphs.

Event detection (ED) seeks to recognize event triggers and classify them into predefined event types. Chinese ED is formulated as a character-level task because word boundaries are uncertain. Prior methods incorporate word-level information into characters to enrich their semantics, but they suffer from two problems. First, they fail to propagate word-level information to every character a word encompasses, which leads to insufficient word-character interaction. Second, they struggle to distinguish events of similar types when annotated instances are limited, which we call the event confusing problem. This paper proposes a label-aware heterogeneous graph attention network (L-HGAT) to address both problems. Specifically, we first build a heterogeneous graph with two node types and three edge types to preserve word-character interactions as fully as possible, and then apply a heterogeneous graph attention network to strengthen semantic propagation between characters and words. Furthermore, we design a pushing-away game that enlarges, for each character, the prediction gap between the ground-truth event type and its confusing counterpart. Experimental results show that L-HGAT consistently outperforms prior competitive methods.
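To make the graph construction concrete, the following is a minimal sketch of a heterogeneous graph with two node types (characters and words) and three edge types. The abstract does not name the three edge types, so the `char-char` and `word-word` edges below are assumptions made for illustration; only the word-to-covered-character links follow directly from the text. The function name `build_hetero_graph` and the span format are hypothetical and are not taken from the paper.

```python
def build_hetero_graph(chars, word_spans):
    """Build edge lists for a character/word heterogeneous graph.

    chars:      list of characters in the sentence (character nodes).
    word_spans: list of (start, end) spans, end exclusive, one per
                lexicon-matched word (word nodes).
    Edge type names are illustrative assumptions, not the paper's.
    """
    edges = {"char-char": [], "word-char": [], "word-word": []}

    # Assumed: neighbouring characters are linked to keep local order.
    for i in range(len(chars) - 1):
        edges["char-char"].append((i, i + 1))

    # Grounded in the abstract: a word is linked to EVERY character it
    # encompasses, not only to its first or last character.
    for w, (start, end) in enumerate(word_spans):
        for i in range(start, end):
            edges["word-char"].append((w, i))

    # Assumed: words are chained in order of their spans.
    order = sorted(range(len(word_spans)), key=lambda w: word_spans[w])
    for a, b in zip(order, order[1:]):
        edges["word-word"].append((a, b))

    return edges


# Usage: three matched words over a four-character sentence.
graph = build_hetero_graph(list("北京大学"), [(0, 2), (0, 4), (2, 4)])
```

A graph attention layer would then aggregate messages separately along each edge type; that step is omitted here because the abstract gives no details about it.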


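Extended Abstract (translated from Chinese)

1. Context
Event detection aims to identify event triggers in a given text and to determine the type of event each trigger belongs to. Unlike English event detection, which is usually modeled at the word level, Chinese event detection is usually modeled at the character level because Chinese lacks natural word delimiters. However, a single character carries limited semantics, so word-level semantics often need to be introduced to enrich it. Existing work has tried to fuse character-level and word-level semantics for Chinese event detection, but two problems still limit performance: insufficient semantic interaction between characters and words, and confusion between similar event types.
2. Objective
Our goal is to design a method for Chinese event detection that fully models the semantic interaction between characters and words and effectively distinguishes the semantics of confusable event labels.
3. Method
This paper proposes a label-aware Chinese event detection method based on a heterogeneous graph network. To model character-word interaction, we build a heterogeneous graph that preserves character-word interactions to the greatest possible extent, and we use a heterogeneous graph attention network to enhance semantic propagation between characters and words. To separate confusable labels, we design a matching module that explores the semantics of event labels and uses them to distinguish easily confused event labels.
4. Result & Findings
Our method achieves strong performance on ACE2005 and KBP2017, currently the most widely used event detection datasets, and outperforms previous Chinese event detection methods. We also conduct a series of analysis experiments to verify the effectiveness of the different modules of our model.
5. Conclusions
From our experiments and analysis we draw two conclusions about Chinese event detection: 1) it is essential to introduce word-level information to enrich character semantics and to fully model the semantic interaction between characters and words; 2) exploring the semantics carried by event labels and learning them with a margin loss effectively distinguishes easily confused events. In future work, we plan to apply our method to Chinese named entity recognition.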
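The pushing-away idea and the margin loss mentioned above can be illustrated with a small hinge-style loss. This is a sketch under stated assumptions, not the paper's formulation: it assumes each character already has one matching score per event type, takes the highest-scoring wrong type as the "confusing counterpart", and uses a hypothetical `margin` of 1.0. The function name `pushing_away_loss` is also illustrative.

```python
def pushing_away_loss(scores, gold, margin=1.0):
    """Mean hinge loss separating the gold type from its closest rival.

    scores: list of per-character score lists, one score per event type.
    gold:   list of gold event-type indices, one per character.
    margin: required gap between gold and rival scores (assumed value).
    """
    total = 0.0
    for char_scores, y in zip(scores, gold):
        # Assumed definition of the confusing counterpart: the wrong
        # event type with the highest score for this character.
        confusing = max(s for t, s in enumerate(char_scores) if t != y)
        # Penalize only when the gold score does not beat the rival
        # by at least `margin`.
        total += max(0.0, margin - char_scores[y] + confusing)
    return total / len(scores)


# Usage: the first character is already well separated, the second is not.
loss = pushing_away_loss([[2.0, 0.5, 0.1], [0.2, 0.3, 0.1]], gold=[0, 0])
```

Minimizing such a loss pushes the score of the gold type away from that of its closest rival, which is one way to realize the enlarged prediction gap the abstract describes.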

Key words: Chinese event detection; heterogeneous graph attention network; label embedding


ISSN 1000-9000 (Print), 1860-4749 (Online)
CN 11-2296/TP
