Improving Syntactic Parsing of Chinese with Empty Element Recovery

  • Abstract: This paper raises and examines the problem of empty element (EE) recovery in Chinese from the perspective of syntactic parsing, a problem that has been largely ignored in the literature. First, through statistical analysis and preliminary experiments, we show why EEs play a critical role in Chinese syntactic parsing and how re-categorizing them from the syntactic perspective lets them benefit parsing more. Then, we propose two ways to recover EEs automatically: a joint constituent parsing approach and a chunk-based dependency parsing approach. Evaluation on the Chinese TreeBank (CTB) 5.1 corpus shows that integrating EE recovery into the Charniak parser yields a significant improvement of 1.29 in F1-measure. To the best of our knowledge, this is the first close examination of EEs in Chinese syntactic parsing, and the topic deserves more attention in future research.

     
