Constructing Maximum Entropy Language Models for Movie Review Subjectivity Analysis
-
Abstract
Document subjectivity analysis has become an important aspect of web text content mining. The problem is similar to traditional text categorization, so many related classification techniques can be adapted to it. There is, however, one significant difference: more language or semantic information is required to estimate the subjectivity of a document well. In this paper, we therefore focus mainly on two aspects. One is how to extract useful and meaningful language features, and the other is how to construct appropriate language models efficiently for this special task. For the first issue, we apply a Global-Filtering and Local-Weighting strategy to select and evaluate language features in a series of n-grams with different orders and within various distance-windows. For the second issue, we adopt Maximum Entropy (MaxEnt) modeling methods to construct our language model framework. Besides the classical MaxEnt models, we have also constructed two kinds of improved models, with Gaussian and exponential priors respectively. Detailed experiments given in this paper show that, with well-selected and weighted language features, MaxEnt models with exponential priors are significantly more suitable for the text subjectivity analysis task.
-