Emphasizing Essential Words for Sentiment Classification Based on Recurrent Neural Networks
Journal of Computer Science and Technology
Journal of Computer Science and Technology 2017, Vol. 32, Issue 4: 785-795    DOI: 10.1007/s11390-017-1759-2
Special Issue on Deep Learning
Emphasizing Essential Words for Sentiment Classification Based on Recurrent Neural Networks
Fei Hu1,2, Student Member, CCF, Li Li1,*, Senior Member, CCF, Member, ACM, Zi-Li Zhang1, Distinguished Member, CCF, Member, ACM, Jing-Yuan Wang1, Student Member, CCF, Xiao-Fei Xu1, Student Member, CCF
1 College of Computer and Information Science, Southwest University, Chongqing 400715, China;
2 Network Centre, Chongqing University of Education, Chongqing 400065, China

Abstract: With the explosion of online communication and publication, texts have become available through forums, chat messages, blogs, book reviews and movie reviews. These texts are usually short and noisy, lacking the statistical signals and information needed for good semantic analysis. Traditional natural language processing methods, such as bag-of-words (BOW) based probabilistic latent semantic models, fail to achieve high performance in this short-text setting. Recent research has focused on the correlations between words, i.e., term dependencies, which can help mine the latent semantics hidden in short texts and help people understand them. The long short-term memory (LSTM) network can capture term dependencies and retain information over long periods of time. LSTM has been widely used and has obtained promising results on a variety of problems involving the latent semantics of texts. At the same time, by analyzing the texts, we find that a number of keywords contribute greatly to the semantics of the texts. In this paper, we establish a keyword vocabulary and propose an LSTM-based model that is sensitive to the words in the vocabulary, so that the keywords drive the semantics of the full document. The proposed model is evaluated on a short-text sentiment analysis task using two datasets: IMDB and SemEval-2016. Experimental results demonstrate that our model outperforms the baseline LSTM by 1% to 2% in terms of accuracy and delivers significant improvements over several non-recurrent neural network latent semantic models, especially on short texts. We also incorporate the idea into a variant of LSTM, the gated recurrent unit (GRU) model, and achieve good performance, which indicates that our method is general enough to improve different deep learning models.
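The abstract does not say how the model is made sensitive to vocabulary keywords, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes one simple, hypothetical mechanism: the embedding of any token found in the keyword vocabulary is scaled by a factor `boost` before it enters a standard LSTM cell. The names `lstm_step`, `encode`, `boost`, the toy vocabulary and the random weights are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h, c, W, U, b):
    """One standard LSTM step. Gates are stacked as [input, forget, output, candidate]."""
    H = h.shape[0]
    z = W @ x + U @ h + b              # shape (4H,)
    i = sigmoid(z[0:H])                # input gate
    f = sigmoid(z[H:2 * H])            # forget gate
    o = sigmoid(z[2 * H:3 * H])        # output gate
    g = np.tanh(z[3 * H:4 * H])        # candidate cell state
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new


def encode(tokens, embeddings, keywords, params, boost=2.0):
    """Run an LSTM over tokens, scaling keyword embeddings by `boost`.

    The scaling is a hypothetical emphasis mechanism (an assumption),
    not the method described in the paper.
    """
    W, U, b = params
    H = U.shape[1]
    h = np.zeros(H)
    c = np.zeros(H)
    for tok in tokens:
        x = embeddings[tok]
        if tok in keywords:
            x = boost * x              # emphasize words in the keyword vocabulary
        h, c = lstm_step(x, h, c, W, U, b)
    return h                           # final hidden state as the text representation


# Toy setup with random, untrained weights (illustration only).
rng = np.random.default_rng(0)
D, H = 3, 4                            # embedding size, hidden size
vocab = ["the", "movie", "was", "great", "awful"]
embeddings = {w: rng.normal(size=D) for w in vocab}
params = (rng.normal(size=(4 * H, D)),
          rng.normal(size=(4 * H, H)),
          np.zeros(4 * H))
keywords = {"great", "awful"}          # hypothetical sentiment keyword vocabulary
tokens = ["the", "movie", "was", "great"]

h_plain = encode(tokens, embeddings, keywords, params, boost=1.0)
h_emph = encode(tokens, embeddings, keywords, params, boost=2.0)
```

With `boost=1.0` the sketch reduces to an ordinary LSTM encoder, which makes it easy to compare the emphasized and baseline representations. In a real system the final hidden state would feed a trained classifier, and the emphasis would more plausibly be learned than fixed.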
Keywords: short text understanding; long short-term memory (LSTM); gated recurrent unit (GRU); sentiment classification; deep learning
Received 2016-12-20;
Fund:

The work was supported by the Scientific and Technological Research Program of Chongqing Municipal Education Commission of China under Grant No. KJ1501405, the National Natural Science Foundation of China under Grant No. 61170192, and the Chongqing Science and Technology Commission (CSTC) under Grant No. cstc2015gjhz40002.

Corresponding Authors: Li Li     Email: lily@swu.edu.cn
Cite this article:   
Fei Hu, Li Li, Zi-Li Zhang, Jing-Yuan Wang, Xiao-Fei Xu. Emphasizing Essential Words for Sentiment Classification Based on Recurrent Neural Networks[J]. Journal of Computer Science and Technology, 2017, V32(4): 785-795
URL:  
http://jcst.ict.ac.cn:8080/jcst/EN/10.1007/s11390-017-1759-2