2016, Vol. 31, Issue (3): 595-603. doi: 10.1007/s11390-016-1649-z

Special Issue: Artificial Intelligence and Pattern Recognition

• Data Management and Data Mining •

A Hybrid Method of Domain Lexicon Construction for Opinion Targets Extraction Using Syntax and Semantics

Chun Liao, Chong Feng* (Member, CCF), Sen Yang, and He-Yan Huang (Member, CCF)

  1. Department of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China;
  2. Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2014-11-17 Revised: 2016-02-29 Online: 2016-05-05 Published: 2016-05-05
  • Contact: Chong Feng E-mail: fengchong@bit.edu.cn
  • Supported by:

    The work was supported by the National Basic Research 973 Program of China under Grant Nos. 2013CB329605 and 2013CB329303, and the National Natural Science Foundation of China under Grant No. 61201351.

Opinion targets extraction from Chinese microblogs plays an important role in opinion mining. Significant progress has been made in this area recently, especially with the method based on conditional random fields (CRF). However, this method considers only lexicon-related features and does not exploit implicit syntactic and semantic knowledge. We propose a novel approach that combines a domain lexicon with groups of syntactic and semantic features. The approach builds the domain lexicon in a novel way, exploring syntactic and semantic information through part-of-speech, dependency structure, phrase structure, semantic role, and semantic similarity based on word embedding. We then combine the domain lexicon with the opinion targets extracted by a CRF that uses these groups of features. Experimental results on the COAE2014 dataset show that the approach outperforms other well-known methods on the task of opinion targets extraction.
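
Two steps named in the abstract can be illustrated with a minimal sketch: growing a lexicon by word-embedding similarity, and merging lexicon matches with CRF output. This is not the authors' implementation. The function names, the cosine threshold of 0.8, the seed-word strategy, and the toy three-dimensional vectors are all assumptions made for illustration, and the CRF output is represented by a hand-written list rather than a trained model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_lexicon(seeds, candidates, embeddings, threshold=0.8):
    """Add each candidate whose embedding is close to at least one seed word.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    lexicon = set(seeds)
    for word in candidates:
        if word not in embeddings:
            continue
        for seed in seeds:
            if seed in embeddings and cosine(embeddings[word], embeddings[seed]) >= threshold:
                lexicon.add(word)
                break
    return lexicon


def combine_targets(lexicon, crf_targets, tokens):
    """Union of tokens found in the lexicon and targets proposed by a CRF."""
    lexicon_hits = {tok for tok in tokens if tok in lexicon}
    return lexicon_hits | set(crf_targets)


# Toy data, invented for this sketch (real embeddings would be learned from a corpus).
embeddings = {
    "screen": [0.9, 0.1, 0.0],
    "display": [0.85, 0.15, 0.05],
    "battery": [0.1, 0.9, 0.1],
    "today": [0.0, 0.1, 0.95],
}
lexicon = expand_lexicon({"screen"}, ["display", "battery", "today"], embeddings)

# Stand-in for the output of a trained CRF tagger on one tokenized microblog.
crf_targets = ["battery"]
targets = combine_targets(lexicon, crf_targets, ["the", "display", "drains", "battery"])
```

In this sketch the lexicon catches a target the CRF stand-in missed, and the CRF stand-in contributes a target absent from the lexicon, which is the complementarity the combination step relies on.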




