2013, Vol. 28, Issue (6): 1106-1116. doi: 10.1007/s11390-013-1401-x
Topic: Artificial Intelligence and Pattern Recognition
• Special Section on Selected Paper from NPC 2011 •
Guo-Dong Zhou (周国栋), Senior Member, CCF, Member, ACM, IEEE, and Pei-Feng Li (李培峰), Member, CCF
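This paper proposes and explores, from a syntactic parsing perspective, the largely overlooked problem of empty element recovery in Chinese. First, through statistical analysis and preliminary experiments, we verify the substantial role that empty elements play in Chinese syntactic parsing and the ways in which parsing benefits from them. We then propose two methods for empty element recovery: joint constituent parsing and chunk-based dependency parsing. Experiments on the Chinese Treebank (CTB) 5.1 show that integrating empty element recovery into the Charniak parser improves its F1 score by 1.29. To the best of our knowledge, this is the first comprehensive, in-depth exploration of empty elements in Chinese syntactic parsing, and the problem deserves more attention and further study in future research.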
Copyright © Editorial Office of the Journal of Computer Science and Technology