2015, Vol. 30, Issue (4): 903-916. doi: 10.1007/s11390-015-1569-3
Topics: Artificial Intelligence and Pattern Recognition; Data Management and Data Mining
• Special Section on Selected Paper from NPC 2011 •
Lei Fang (房磊), Biao Liu (刘彪), Min-Lie Huang* (黄民烈), Member, CCF
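Extracting feature words and opinion words is an important task in sentiment analysis. In this paper, we propose a method that extracts feature words and opinion words from a corpus, based on a large amount of unlabeled review data and using only a small number of feature-opinion word pairs as prior knowledge. Our main contributions are twofold. First, we propose a data-driven representation of the relations between feature words and opinion words at the corpus level, which can flexibly capture rich linguistic structures. Second, we use a simple unsupervised learning model that incorporates prior knowledge to extract feature words and opinion words, and that reduces, to some extent, the possibility of error propagation during extraction. Experimental results show that the proposed method is very effective for the task of feature and opinion word extraction.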