Journal of Computer Science and Technology ›› 2021, Vol. 36 ›› Issue (1): 207-220. DOI: 10.1007/s11390-020-9713-0

Special Section: Artificial Intelligence and Pattern Recognition


Language Adaptation for Entity Relation Classification via Adversarial Neural Networks

Bo-Wei Zou1,2, Member, CCF, Rong-Tao Huang1, Zeng-Zhuang Xu1, Yu Hong1,*, Member, CCF, and Guo-Dong Zhou1, Member, CCF, ACM, IEEE

  1 School of Computer Science and Technology, Soochow University, Suzhou 215006, China;
  2 Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore 138635, Singapore
  • Received: 2019-05-13  Revised: 2020-04-15  Online: 2021-01-05  Published: 2021-01-23
  • Corresponding author: Yu Hong, E-mail: fyhong@suda.edu.cn
  • About the author: Bo-Wei Zou received his Ph.D. degree in computer science and technology from Soochow University, Suzhou, in 2015, and his M.S. and B.S. degrees in computer science and technology from Harbin Institute of Technology, Harbin, in 2009 and 2007, respectively. He is currently a scientist at the Institute for Infocomm Research, A*STAR, Singapore. His main research interests include information extraction, question answering, and conversation comprehension. He is a member of CCF.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China under Grant Nos. 61703293, 61751206, and 61672368.


Entity relation classification, an important direction in information extraction research, aims to identify the semantic relation between two entities in a sentence. However, most existing studies concentrate on a single language, such as English, and neglect other languages, especially those lacking large amounts of labeled entity-relation data. How to effectively leverage a richly-annotated language to help build entity relation classification models for a poorly-annotated language therefore remains a challenging problem. To address it, this paper proposes a cross-lingual entity relation classification model based on adversarial neural networks: through the competition between a feature encoder and a language discriminator, the model imitates, on the target language, the feature encoding learned on the source language, thereby transferring feature representations across languages to solve the language adaptation problem. In addition, the paper verifies the effectiveness of two adversarial structures, single-channel and dual-channel, on the entity relation classification task. Experiments on the ACE 2005 multilingual entity relation classification corpus show that the single-channel adversarial model achieves the best performance in both unsupervised and semi-supervised entity relation classification, outperforming the previous best systems by 6.61% and 2.98%, respectively. Compared with cross-lingual models that directly apply machine translation to the source or target language, both the proposed single-channel and dual-channel models improve performance significantly. The experimental analysis further examines the influence of the main modules and parameter settings, providing a reference for related research. Future work will focus on the role of adversarial neural networks in multilingual scenarios and their performance on other natural language processing tasks.


Abstract: Entity relation classification aims to classify the semantic relationship between two marked entities in a given sentence, and plays a vital role in various natural language processing applications. However, existing studies focus on exploiting monolingual data, mainly in English, due to the lack of labeled data in other languages. How to effectively benefit from a richly-labeled language to help a poorly-labeled language is still an open problem. In this paper, we propose a language adaptation framework for cross-lingual entity relation classification. The basic idea is to employ adversarial neural networks (AdvNN) to transfer feature representations from one language to another. In particular, the framework enables feature imitation via the competition between a sentence encoder and a rival language discriminator, so as to generate effective representations. To verify the effectiveness of AdvNN, we introduce two kinds of adversarial structures: dual-channel AdvNN and single-channel AdvNN. Experimental results on the ACE 2005 multilingual training corpus show that our single-channel AdvNN achieves the best performance in both unsupervised and semi-supervised scenarios, yielding improvements of 6.61% and 2.98% over the state-of-the-art, respectively. Compared with baselines that directly adopt a machine translation module, both dual-channel and single-channel AdvNN significantly improve the F1 score of cross-lingual entity relation classification. Moreover, extensive analysis and discussion demonstrate the appropriateness and effectiveness of different parameter settings in our language adaptation framework.

Key words: adversarial neural network, entity relation classification, language adaptation
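The encoder-versus-discriminator competition described in the abstract is commonly realized with a gradient reversal layer: the discriminator's gradient is negated before it reaches the encoder, so that while the discriminator learns to tell the source language from the target language, the encoder is pushed toward language-invariant features. The sketch below illustrates only this mechanism; the toy linear encoder, logistic discriminator, and the name `grad_reverse` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def grad_reverse(grad, lam=1.0):
    """Gradient reversal: identity in the forward pass, -lam * grad backward."""
    return -lam * grad

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
d = 4
W_enc = rng.normal(size=(d, d)) * 0.1   # toy linear "sentence encoder"
w_disc = rng.normal(size=d) * 0.1       # toy logistic language discriminator
x = rng.normal(size=d)                  # one toy sentence representation
y_lang = 1.0                            # language label (1 = source language)

# forward pass
h = W_enc @ x                           # encoded features
p = sigmoid(w_disc @ h)                 # P(language = source | h)

# backward pass: cross-entropy gradient with respect to the logit
dlogit = p - y_lang
grad_w_disc = dlogit * h                # update direction for the discriminator
grad_h = dlogit * w_disc                # gradient flowing back into the encoder
grad_h_enc = grad_reverse(grad_h)       # reversed before reaching the encoder

# the encoder descends the exact opposite direction the discriminator wants,
# which is what drives the features toward language invariance
assert np.allclose(grad_h_enc, -grad_h)
```

The scale factor `lam` trades off how strongly the encoder is pushed toward language invariance against how well the discriminator can still be fit; in domain-adversarial training it is often annealed from 0 to 1 over the course of training.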
