Journal of Computer Science and Technology, 2012, Vol. 27, Issue (3): 650-666. doi: 10.1007/s11390-012-1251-y



Phrase-Level Sentiment Polarity Classification Using Rule-Based Typed Dependencies and Additional Complex Phrases Consideration

Luke Kien-Weng Tan1 (陈坚永), Jin-Cheon Na1 (罗镇川), Member, ACM, Yin-Leng Theng1 (邓燕玲), and Kuiyu Chang2 (张圭煜)   

  1. Wee Kim Wee School of Communication and Information, Nanyang Technological University, 31 Nanyang Link 637718, Singapore;
  2. School of Computer Engineering, Nanyang Technological University, Block N4 Nanyang Avenue, 639798, Singapore
  • Received: 2011-09-01 Revised: 2012-01-19 Online: 2012-05-05 Published: 2012-05-05
  • About author: Luke Kien-Weng Tan received the B.Eng. degree from the National University of Singapore (NUS), and the M.S. degree in information systems from Nanyang Technological University (NTU), Singapore. He is currently a Ph.D. candidate in the Wee Kim Wee School of Communication & Information at NTU.


Abstract: The advent of Web 2.0 has led to an increase in user-generated content on the Web. This has produced an extensive collection of free-style texts containing opinion expressions that can influence the decisions and actions of their readers. Providers of such content exert a certain level of influence on their audience, as is evident from blog sites that affect their readers' purchase decisions, political viewpoints, financial planning, and more. By detecting the opinions expressed, we can identify the sentiments on the topics discussed and the influence exerted on the readers. In this paper, we introduce an automatic approach for deriving polarity pattern rules to detect sentiment polarity at the phrase level, and we also consider how more complex relationships between words affect sentiment polarity classification. Recent sentiment analysis research has focused on the functional relations between words using typed dependency parsing, which provides a refined analysis of the grammar and semantics of textual data. Heuristics are typically used to determine the typed dependency polarity patterns, and these may not identify all possible rules. We study the use of class sequential rules (CSRs) to automatically learn the typed dependency patterns, and benchmark the performance of CSR against a heuristic method. Preliminary results show that CSR further improves classification performance, achieving F1 scores of over 80% in the test cases. In addition, we observe more complex relationships between words that could influence phrase sentiment polarity, and discuss possible approaches to handling their effects.
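To make the idea of a class sequential rule concrete, the following is a minimal sketch of CSR mining in Python. It is an illustration under stated assumptions, not the authors' implementation: the token names (such as `neg` and `amod:POS`), the toy data, the thresholds, and the function names are all hypothetical, and the brute-force enumeration of subsequences stands in for whatever mining algorithm the paper actually uses. A rule `pattern -> label` is kept when its support (the share of all instances that contain the pattern and carry the label) and its confidence (the share of pattern-containing instances that carry the label) both reach the chosen minimums.

```python
from collections import Counter
from itertools import combinations


def subsequences(seq, max_len):
    """Return all ordered, not necessarily contiguous, subsequences of seq
    with between 1 and max_len items."""
    subs = set()
    for n in range(1, max_len + 1):
        for idx in combinations(range(len(seq)), n):
            subs.add(tuple(seq[i] for i in idx))
    return subs


def mine_csr(data, min_support=0.3, min_confidence=0.7, max_len=2):
    """Mine class sequential rules from (sequence, label) pairs.

    Returns a list of (pattern, label, support, confidence) tuples,
    sorted by confidence, then support, then pattern.
    """
    pattern_count = Counter()  # instances containing the pattern
    rule_count = Counter()     # instances containing the pattern with a given label
    for seq, label in data:
        for pat in subsequences(seq, max_len):
            pattern_count[pat] += 1
            rule_count[(pat, label)] += 1

    total = len(data)
    rules = []
    for (pat, label), hits in rule_count.items():
        support = hits / total
        confidence = hits / pattern_count[pat]
        if support >= min_support and confidence >= min_confidence:
            rules.append((pat, label, support, confidence))
    rules.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return rules


# Hypothetical toy data: each phrase is a sequence of dependency-style
# tokens (invented for illustration) paired with a polarity label.
toy_data = [
    (("neg", "amod:POS"), "negative"),
    (("neg", "amod:POS"), "negative"),
    (("amod:POS",), "positive"),
    (("amod:POS", "advmod"), "positive"),
    (("amod:NEG",), "negative"),
]

if __name__ == "__main__":
    for pattern, label, support, confidence in mine_csr(toy_data):
        print(pattern, "->", label, f"support={support:.2f}", f"confidence={confidence:.2f}")
```

In this toy setting a lone `amod:POS` token appears equally often in positive and negative phrases, so it yields no rule at the chosen confidence threshold, whereas patterns involving the negation token do. This mirrors, in a simplified way, why learned sequential patterns can capture cases that a single-word heuristic misses.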
