Abstract Inferring the semantic types of entity mentions in a sentence is a necessary yet challenging task. Most existing methods employ a very coarse-grained type taxonomy, which is too general and not precise enough for many tasks. However, the performance of these methods drops sharply when the type taxonomy is extended to a fine-grained one with several hundred types. In this paper, we introduce a hybrid neural network model for type classification of entity mentions with a fine-grained taxonomy. Our model has four components: the entity mention component, the context component, the relation component, and the already-known type component. They extract features from the target entity mention, its context, its relations, and the already-known types of entity mentions in the surrounding context, respectively. The features learned by the four components are concatenated and fed into a softmax layer to predict the type distribution. We carried out extensive experiments to evaluate the proposed model. Experimental results demonstrate that our model achieves state-of-the-art performance on the FIGER dataset. Moreover, we extracted larger datasets from Wikipedia and DBpedia. On these larger datasets, our model achieves performance comparable to the state-of-the-art methods with the coarse-grained type taxonomy, and performs much better than those methods with the fine-grained type taxonomy in terms of micro-F1, macro-F1 and weighted-F1.
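The final prediction step described in the abstract (concatenating the four feature vectors and feeding them into a softmax layer) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the vector sizes, the number of types, and the random weights are all hypothetical, and the four components that would learn the features are replaced by random placeholder vectors.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_type_distribution(mention, context, relation, known_types,
                              weights, bias):
    """Concatenate four feature vectors, apply a linear layer, then softmax.

    weights holds one row per type; each row has as many entries as the
    concatenated feature vector.
    """
    features = mention + context + relation + known_types  # concatenation
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(scores)


# Hypothetical toy setup: random vectors stand in for the learned features.
rng = random.Random(0)
mention = [rng.uniform(-1, 1) for _ in range(4)]
context = [rng.uniform(-1, 1) for _ in range(6)]
relation = [rng.uniform(-1, 1) for _ in range(3)]
known_types = [rng.uniform(-1, 1) for _ in range(2)]

num_types = 5  # assumed; the paper's fine-grained taxonomy has hundreds
dim = len(mention) + len(context) + len(relation) + len(known_types)
weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
           for _ in range(num_types)]
bias = [0.0] * num_types

probs = predict_type_distribution(mention, context, relation, known_types,
                                  weights, bias)
print(probs)
```

The output is a probability distribution over the assumed types. In a trained model, the weights and the four feature extractors would be learned jointly rather than sampled at random.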
This work is supported by the National Natural Science Foundation of China under Grant Nos. 61672322 and 61672324, the Natural Science Foundation of Shandong Province of China under Grant No. 2016ZRE27468, and the Fundamental Research Funds of Shandong University.
Cite this article:
Kai-Yuan Cui, Peng-Jie Ren, Zhu-Min Chen, Tao Lian, Jun Ma. Relation Enhanced Neural Model for Type Classification of Entity Mentions with a Fine-Grained Taxonomy[J]. Journal of Computer Science and Technology, 2017, V32(4): 814-827