Linguistic Theory Based Contextual Evidence Mining for Statistical Chinese Co-Reference Resolution
-
Abstract
Within a statistical learning framework, this paper focuses on how to use traditional linguistic findings on anaphora resolution as a guide for mining and organizing contextual features for Chinese co-reference resolution. The main contributions are as follows. (1) To simulate the ``syntactic and semantic parallelism factor'', we extract a ``bags of word form and POS'' feature and a ``bag of semes'' feature from the contexts of the entity mentions and incorporate them into the baseline feature set. (2) Because bags of word forms, POS tags, and semes are too coarse to determine the syntactic and semantic parallelism between two entity mentions, we propose a method for contextual feature reconstruction based on semantic similarity computation, so that the reconstructed contextual features better approximate the anaphora resolution factor of ``Syntactic and Semantic Parallelism Preferences''. (3) We use an entity-mention-based contextual feature representation instead of an isolated word-based one, and additionally enlarge the contextual windows, in order to approximately simulate the ``selectional restriction factor'' for anaphora resolution. Experiments show that the multi-level contextual features are useful for co-reference resolution, and that the statistical system incorporating these features performs well on the standard ACE datasets.
-
-