Incorporating Linguistic Structure into MaximumEntropy Language Models
-
Abstract
In statistical language models, how to integrate diverse linguistic knowledge in ageneral framework for long-distance dependencies is a challenging issue. In this paper, an improved language model incorporating linguistic structure into maximum entropy framework is presented.The proposed model combines trigram with the structure knowledge of base phrase in which trigram is used to capture the local relation between words, while the structure knowledge of base phrase is considered to represent the long-distance relations between syntactical structures. The knowledge of syntax, semantics and vocabulary is integrated into the maximum entropy framework.Experimental results show that the proposed model improves by 24% for language model perplexity and increases about 3% for sign language recognition rate compared with the trigram model.
-
-