Journal of Computer Science and Technology
Indexed by   SCIE, EI ...
Bimonthly    Since 1986

2017 Vol.32 No.4
Published 2017-07-05

Special Issue on Deep Learning
Special Issue on Deep Learning
663 Hang Li, Xiang Bai, Xuan-Jing Huang, Chang-Shui Zhang
2017 Vol. 32 (4): 663-666 [Abstract] ( 123 ) [HTML 1KB] [PDF 137KB] ( 137 )
667 Shu-Chang Zhou, Yu-Zhi Wang, He Wen, Qin-Yao He, Yu-Heng Zou
Balanced Quantization: An Effective and Efficient Approach to Quantized Neural Networks
Quantized neural networks (QNNs), which use low-bitwidth numbers for representing parameters and performing computations, have been proposed to reduce computational complexity, storage size and memory usage. In QNNs, parameters and activations are uniformly quantized, such that the multiplications and additions can be accelerated by bitwise operations. However, distributions of parameters in neural networks are often imbalanced, such that the uniform quantization determined from extremal values may underutilize the available bitwidth. In this paper, we propose a novel quantization method that can ensure the balance of distributions of quantized values. Our method first recursively partitions the parameters by percentiles into balanced bins, and then applies uniform quantization. We also introduce computationally cheaper approximations of percentiles to reduce the computation overhead introduced. Overall, our method improves the prediction accuracies of QNNs without introducing extra computation during inference, has negligible impact on training speed, and is applicable to both convolutional neural networks and recurrent neural networks. Experiments on standard datasets including ImageNet and Penn Treebank confirm the effectiveness of our method. On ImageNet, the top-5 error rate of our 4-bit quantized GoogLeNet model is 12.7%, which is superior to the state of the art among QNNs.
2017 Vol. 32 (4): 667-682 [Abstract] ( 38 ) [HTML 1KB] [PDF 461KB] ( 49 )
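The two-step idea in the abstract above (partition by percentiles into balanced bins, then quantize uniformly) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the recursive median split, and the mapping of bin indices to evenly spaced levels in [-1, 1] are all assumptions made for illustration, and the baseline assumes the weights are not all identical.

```python
from collections import Counter


def balanced_bins(values, bits):
    """Assign each value a bin index in [0, 2**bits) by recursive median splits."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)

    def split(lo, hi, depth, base):
        # order[lo:hi] holds the indices of this partition, sorted by value.
        if depth == 0:
            for i in order[lo:hi]:
                bins[i] = base
            return
        mid = (lo + hi) // 2  # median split keeps both halves equally populated
        half = 1 << (depth - 1)
        split(lo, mid, depth - 1, base)
        split(mid, hi, depth - 1, base + half)

    split(0, len(values), bits, 0)
    return bins


def balanced_quantize(values, bits):
    """Map bin indices to evenly spaced levels in [-1, 1] (illustrative choice)."""
    levels = (1 << bits) - 1
    return [2.0 * b / levels - 1.0 for b in balanced_bins(values, bits)]


def uniform_quantize(values, bits):
    """Baseline: uniform quantization determined by the extremal values."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    return [2.0 * round((v - lo) / (hi - lo) * levels) / levels - 1.0 for v in values]


# Imbalanced toy weights: most are near zero, one outlier stretches the range.
weights = [0.01, -0.02, 0.03, 0.00, -0.01, 0.02, 0.04, 5.00]
print(Counter(balanced_quantize(weights, 2)))  # every level is used equally often
print(Counter(uniform_quantize(weights, 2)))   # almost all weights share one level
```

On these toy weights the extremal-value baseline places seven of the eight values on the same level, while the balanced variant uses each of the four 2-bit levels twice, which is the underutilization the abstract describes.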
683 Xin-Yu Ou, Ping Li, He-Fei Ling, Si Liu, Tian-Jiang Wang, Dan Li
Objectness Region Enhancement Networks for Scene Parsing
Semantic segmentation has recently witnessed rapid progress, but existing methods only focus on identifying objects or instances. In this work, we aim to address the task of semantic understanding of scenes with deep learning. Unlike many existing methods, our method focuses on techniques that improve existing algorithms rather than proposing a whole new framework. Objectness enhancement is the first effective technique. It exploits the detection module to produce object region proposals with category probabilities, and these regions are used to weight the parsing feature map directly. An "extra background" category is often attached to the category space as a specific category to improve parsing results in semantic and instance segmentation tasks. In scene parsing tasks, the extra background category is still beneficial for training the model. However, some pixels may be assigned to this nonexistent category in inference. A black-hole filling technique is proposed to avoid this incorrect classification. To verify these two techniques, we integrate them into a parsing framework for generating parsing results. We call this unified framework the Objectness Enhancement Network (OENet). Compared with previous work, our proposed OENet system effectively improves the performance over the original model on the SceneParse150 scene parsing dataset, reaching 38.4 mIoU (mean intersection-over-union) and 77.9% accuracy on the validation set without assembling multiple models. Its effectiveness is also verified on the Cityscapes dataset.
2017 Vol. 32 (4): 683-700 [Abstract] ( 34 ) [HTML 1KB] [PDF 3951KB] ( 35 )
701 Xiang Bai, Zheng Zhang, Hong-Yang Wang, Wei Shen
Directional Edge Boxes: Exploiting Inner Normal Direction Cues for Effective Object Proposal Generation
Edges are important cues for localizing object proposals. Recent progress on this problem has mostly been driven by defining effective objectness measures based on edge cues. In this paper, we develop a new representation named directional edges, in which each edge pixel is assigned a direction toward the object center, by learning a direction prediction model with convolutional neural networks in a holistic manner. Based on directional edges, two new objectness measures are designed for ranking object proposals. Experiments show that the proposed method achieves 97.1% object recall at an overlap threshold of 0.5 and 81.9% object recall at an overlap threshold of 0.7 at 1 000 proposals on the PASCAL VOC 2007 test dataset, which is superior to the state-of-the-art methods.
2017 Vol. 32 (4): 701-713 [Abstract] ( 37 ) [HTML 1KB] [PDF 6893KB] ( 25 )
714 Meng Chen, Lin-Lin Zhang, Xiaohui Yu, Yang Liu
Weighted Co-Training for Cross-Domain Image Sentiment Classification
Image sentiment classification, which aims to predict the polarities of sentiments conveyed by the images, has gained a lot of attention. Most existing methods address this problem by training a general classifier with certain visual features, ignoring the discrepancies across domains. In this paper, we propose a novel weighted co-training method for cross-domain image sentiment classification, which iteratively enlarges the labeled set by introducing new high-confidence classified samples to reduce the gap between the two domains. We train two sentiment classifiers with both the images and the corresponding textual comments separately, and set the similarity between the source domain and the target domain as the weight of a classifier. We perform extensive experiments on a real Flickr dataset to evaluate the proposed method, and the empirical study reveals that the weighted co-training method significantly outperforms some baseline solutions.
2017 Vol. 32 (4): 714-725 [Abstract] ( 40 ) [HTML 1KB] [PDF 1276KB] ( 31 )
726 Tian-Bi Jiang, Gui-Song Xia, Qi-Kai Lu, Wei-Ming Shen
Retrieving Aerial Scene Images with Learned Deep Image-Sketch Features
This paper investigates the problem of retrieving aerial scene images by using semantic sketches, since state-of-the-art retrieval systems become inapplicable when no exemplar query aerial image is available. However, due to the complex surface structures and huge variations in resolution of aerial images, it is very challenging to retrieve aerial images with sketches, and few studies have been devoted to this task. In this article, for the first time to our knowledge, we propose a framework to bridge the gap between sketches and aerial images. First, an aerial sketch-image database is collected, and the images and sketches it contains are augmented to various levels of detail. We then train a multi-scale deep model on the new dataset. The fully-connected layers of the network at each scale are finally connected and used as cross-domain features, and the Euclidean distance is used to measure the cross-domain similarity between aerial images and sketches. Experiments on several commonly used aerial image datasets demonstrate the superiority of the proposed method compared with the traditional approaches.
2017 Vol. 32 (4): 726-737 [Abstract] ( 37 ) [HTML 1KB] [PDF 5756KB] ( 41 )
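The retrieval step described in the abstract above, in which per-scale features are joined into one descriptor and compared by Euclidean distance, might look roughly like the following Python sketch. The feature values, the gallery names, and the reading of "connected" as simple concatenation are hypothetical; in the paper the features come from a trained multi-scale network.

```python
import math


def concat_scales(per_scale_features):
    """Concatenate the feature vectors produced at each scale into one descriptor."""
    return [x for scale in per_scale_features for x in scale]


def retrieve(sketch_feature, image_features, top_k=2):
    """Rank image names by Euclidean distance to the sketch descriptor."""
    ranked = sorted(
        image_features,
        key=lambda name: math.dist(sketch_feature, image_features[name]),
    )
    return ranked[:top_k]


# Hypothetical two-scale features, two dimensions per scale.
sketch = concat_scales([[0.9, 0.1], [0.8, 0.2]])
gallery = {
    "airport": concat_scales([[1.0, 0.0], [0.9, 0.1]]),
    "harbor": concat_scales([[0.0, 1.0], [0.1, 0.9]]),
    "farmland": concat_scales([[0.5, 0.5], [0.5, 0.5]]),
}
print(retrieve(sketch, gallery))  # nearest gallery images first
```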
738 Ai-Wen Jiang, Bo Liu, Ming-Wen Wang
Deep Multimodal Reinforcement Network with Contextually Guided Recurrent Attention for Image Question Answering
Image question answering (IQA) has emerged as a promising interdisciplinary topic in the fields of computer vision and natural language processing. In this paper, we propose a contextually guided recurrent attention model for solving IQA problems. It is a multimodal recurrent neural network based on deep reinforcement learning. Based on compositional contextual information, it recurrently decides where to look using a reinforcement learning strategy. Unlike traditional "static" soft attention, it can be regarded as a kind of "dynamic" attention whose objective is designed around reinforcement rewards tailored to IQA. The finally learned compositional information incorporates both global context and local informative details, which is shown to benefit answer generation. The proposed method is compared with several state-of-the-art methods on two public IQA datasets, COCO-QA and VQA, both derived from the MS COCO dataset. The experimental results demonstrate that our proposed model outperforms those methods.
2017 Vol. 32 (4): 738-748 [Abstract] ( 27 ) [HTML 1KB] [PDF 1924KB] ( 29 )
749 Lin-Er Yang, Mao-Song Sun, Yong Cheng, Jia-Cheng Zhang, Zheng-Hao Liu, Huan-Bo Luan, Yang Liu
Neural Parse Combination
Analyzing the syntactic structure of natural languages by parsing is an important task in artificial intelligence. Due to the complexity of natural languages, individual parsers tend to make different yet complementary errors. We propose a neural network based approach to combine parses from different parsers to yield a more accurate parse than individual ones. Unlike conventional approaches, our method directly transforms linearized candidate parses into the ground-truth parse. Experiments on the Penn English Treebank show that the proposed method improves over a state-of-the-art parser combination approach significantly.
2017 Vol. 32 (4): 749-757 [Abstract] ( 29 ) [HTML 1KB] [PDF 517KB] ( 27 )
758 Zhuo-Ran Liu, Yang Liu
Exploiting Unlabeled Data for Neural Grammatical Error Detection
Identifying and correcting grammatical errors in text written by non-native writers has received increasing attention in recent years. Although a number of annotated corpora have been established to facilitate data-driven grammatical error detection and correction approaches, they are still limited in terms of quantity and coverage because human annotation is labor-intensive, time-consuming, and expensive. In this work, we propose to utilize unlabeled data to train neural network based grammatical error detection models. The basic idea is to cast error detection as a binary classification problem and derive positive and negative training examples from unlabeled data. We introduce an attention-based neural network to capture long-distance dependencies that influence the word being detected. Experiments show that the proposed approach significantly outperforms SVM and convolutional networks with a fixed-size context window.
2017 Vol. 32 (4): 758-767 [Abstract] ( 35 ) [HTML 1KB] [PDF 377KB] ( 35 )
768 Ayana, Shi-Qi Shen, Yan-Kai Lin, Cun-Chao Tu, Yu Zhao, Zhi-Yuan Liu, Mao-Song Sun
Recent Advances on Neural Headline Generation
Recently, neural models have been proposed for headline generation by learning to map documents to headlines with recurrent neural networks. In this work, we give a detailed introduction to and comparison of existing work and recent improvements in neural headline generation, with particular attention to how encoders, decoders and neural model training strategies alter the overall performance of the headline generation system. Furthermore, we perform a quantitative analysis of most existing neural headline generation systems and summarize several key factors that impact the performance of headline generation systems. We also carry out a detailed error analysis of typical neural headline generation systems in order to gain a better understanding of them. We hope that our results and conclusions will benefit future research.
2017 Vol. 32 (4): 768-784 [Abstract] ( 185 ) [HTML 1KB] [PDF 462KB] ( 108 )
785 Fei Hu, Li Li, Zi-Li Zhang, Jing-Yuan Wang, Xiao-Fei Xu
Emphasizing Essential Words for Sentiment Classification Based on Recurrent Neural Networks
With the explosion of online communication and publication, texts have become obtainable via forums, chat messages, blogs, book reviews and movie reviews. Usually, these texts are short and noisy, without sufficient statistical signals or enough information for a good semantic analysis. Traditional natural language processing methods such as Bag-of-Words (BOW) based probabilistic latent semantic models fail to achieve high performance in the short-text setting. Recent research has focused on the correlations between words, i.e., term dependencies, which can help in mining latent semantics hidden in short texts and help people understand them. The long short-term memory (LSTM) network can capture term dependencies and is able to remember information for long periods of time. LSTM has been widely used and has obtained promising results in a variety of problems involving the latent semantics of texts. At the same time, by analyzing the texts, we find that a number of keywords contribute greatly to the semantics of the texts. In this paper, we establish a keyword vocabulary and propose an LSTM-based model that is sensitive to the words in the vocabulary; hence, the keywords leverage the semantics of the full document. The proposed model is evaluated in a short-text sentiment analysis task on two datasets: IMDB and SemEval-2016. Experimental results demonstrate that our model outperforms the baseline LSTM by 1%-2% in terms of accuracy and achieves a significant performance improvement over several non-recurrent neural network latent semantic models, especially on short texts. We also incorporate the idea into a variant of LSTM named the gated recurrent unit (GRU) model and achieve good performance, which indicates that our method is general enough to improve different deep learning models.
2017 Vol. 32 (4): 785-795 [Abstract] ( 38 ) [HTML 1KB] [PDF 1122KB] ( 31 )
796 Shi-Qi Shen, Yang Liu, Mao-Song
Optimizing Non-Decomposable Evaluation Metrics for Neural Machine Translation
While optimizing model parameters with respect to evaluation metrics has recently proven to benefit end-to-end neural machine translation (NMT), the evaluation metrics used in training are restricted to being defined at the sentence level to facilitate online learning algorithms. This is undesirable because the final evaluation metrics used in the testing phase are usually non-decomposable (i.e., they are defined at the corpus level and cannot be expressed as the sum of sentence-level metrics). To minimize the discrepancy between training and testing, we propose to extend the minimum risk training (MRT) algorithm to take non-decomposable corpus-level evaluation metrics into consideration while still keeping the advantages of online training. This can be done by calculating corpus-level evaluation metrics on a subset of training data at each step in online training. Experiments on Chinese-English and English-French translation show that our approach improves the correlation between training and testing and significantly outperforms the MRT algorithm using decomposable evaluation metrics.
2017 Vol. 32 (4): 796-804 [Abstract] ( 33 ) [HTML 1KB] [PDF 311KB] ( 31 )
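To make "non-decomposable" in the abstract above concrete, the following Python sketch compares a metric computed on a small subset as a whole with the average of the same metric computed sentence by sentence. Clipped unigram precision is used here only as a simplified stand-in for a corpus-level metric such as BLEU; the function names and the toy sentences are assumptions, and the sketch does not implement minimum risk training itself.

```python
from collections import Counter


def match_stats(hyp, ref):
    """Return (clipped unigram matches, hypothesis length) for one sentence pair."""
    overlap = Counter(hyp) & Counter(ref)
    return sum(overlap.values()), len(hyp)


def sentence_precision(hyp, ref):
    """Sentence-level metric: matches divided by the length of this hypothesis."""
    matches, length = match_stats(hyp, ref)
    return matches / length


def corpus_precision(pairs):
    """Corpus-level metric: pool the counts over all pairs before dividing."""
    stats = [match_stats(h, r) for h, r in pairs]
    return sum(m for m, _ in stats) / sum(n for _, n in stats)


# A toy subset of (hypothesis, reference) pairs, as tokens.
batch = [
    ("a b".split(), "a b".split()),                  # 2 of 2 tokens match
    ("c d e f g h".split(), "c x y z w v".split()),  # 1 of 6 tokens match
]
mean_of_sentences = sum(sentence_precision(h, r) for h, r in batch) / len(batch)
print(mean_of_sentences)        # average of 1.0 and 1/6
print(corpus_precision(batch))  # 3 matches over 8 tokens
```

The two numbers differ because the pooled metric weights the long sentence more heavily, so a sum or average of sentence-level scores cannot reproduce it.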
805 Jun Yin, Wayne Xin Zhao, Xiao-Ming Li
Type-Aware Question Answering over Knowledge Base with Attention-Based Tree-Structured Neural Networks
Question answering (QA) over a knowledge base (KB) aims to provide a structured answer from a knowledge base to a natural language question. In this task, a key step is how to represent and understand the natural language query. In this paper, we propose to use tree-structured neural networks constructed based on the constituency tree to model natural language queries. We make an interesting observation about the constituency tree: different constituents have their own semantic characteristics and might be suitable for solving different subtasks in a QA system. Based on this point, we incorporate the type information as an auxiliary supervision signal to improve the QA performance. We call our approach type-aware QA. We jointly characterize both the answer and its answer type in a unified neural network model with the attention mechanism. Instead of simply using the root representation, we represent the query by combining the representations of different constituents using task-specific attention weights. Extensive experiments on public datasets have demonstrated the effectiveness of our proposed model. More specifically, the learned attention weights are quite useful in understanding the query. The produced representations for intermediate nodes can be used for analyzing the effectiveness of components in a QA system.
2017 Vol. 32 (4): 805-813 [Abstract] ( 24 ) [HTML 1KB] [PDF 683KB] ( 59 )
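The idea in the abstract above of combining constituent representations with task-specific attention weights, instead of using only the root, can be sketched as follows in Python. This is a generic dot-product attention with a softmax, chosen for illustration; the node vectors, the task vectors, and the scoring function are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Normalise scores into weights that sum to one."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(node_vectors, task_vector):
    """Weight each constituent vector by its softmax-normalised dot product with task_vector."""
    scores = [sum(a * b for a, b in zip(vec, task_vector)) for vec in node_vectors]
    weights = softmax(scores)
    dim = len(task_vector)
    query = [sum(w * vec[d] for w, vec in zip(weights, node_vectors)) for d in range(dim)]
    return weights, query


# Hypothetical 2-d representations for three constituents of a parsed question.
nodes = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
answer_task = [2.0, 0.0]  # hypothetical task vector for answer prediction
type_task = [0.0, 2.0]    # hypothetical task vector for answer-type prediction

answer_weights, answer_query = attend(nodes, answer_task)
type_weights, type_query = attend(nodes, type_task)
print(answer_weights)  # the first constituent receives the largest weight
print(type_weights)    # the second constituent receives the largest weight
```

Using a separate task vector per subtask lets the same set of constituents yield different query representations for answer prediction and for answer-type prediction.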
814 Kai-Yuan Cui, Peng-Jie Ren, Zhu-Min Chen, Tao Lian, Jun Ma
Relation Enhanced Neural Model for Type Classification of Entity Mentions with a Fine-Grained Taxonomy
Inferring semantic types of the entity mentions in a sentence is a necessary yet challenging task. Most existing methods employ a very coarse-grained type taxonomy, which is too general and not exact enough for many tasks. However, the performance of these methods drops sharply when we extend the type taxonomy to a fine-grained one with several hundred types. In this paper, we introduce a hybrid neural network model for type classification of entity mentions with a fine-grained taxonomy. There are four components in our model, namely, the entity mention component, the context component, the relation component, and the already known type component, which are used to extract features from the target entity mention, the context, the relations and the already known types of the entity mentions in the surrounding context, respectively. The features learned by the four components are concatenated and fed into a softmax layer to predict the type distribution. We carried out extensive experiments to evaluate our proposed model. Experimental results demonstrate that our model achieves state-of-the-art performance on the FIGER dataset. Moreover, we extracted larger datasets from Wikipedia and DBpedia. On the larger datasets, our model achieves performance comparable to the state-of-the-art methods with the coarse-grained type taxonomy, but performs much better than those methods with the fine-grained type taxonomy in terms of micro-F1, macro-F1 and weighted-F1.
2017 Vol. 32 (4): 814-827 [Abstract] ( 40 ) [HTML 1KB] [PDF 1914KB] ( 43 )
828 Ting Bai, Hong-Jian Dou, Wayne Xin Zhao, Ding-Yi Yang, Ji-Rong Wen
An Experimental Study of Text Representation Methods for Cross-Site Purchase Preference Prediction Using the Social Text Data
Nowadays, many e-commerce websites allow users to log in with their existing social networking accounts. When a new user comes to an e-commerce website, it is interesting to study whether the information from external social media platforms can be utilized to alleviate the cold-start problem. In this paper, we focus on a specific task in cross-site information sharing, i.e., leveraging the text posted by a user on the social media platform (termed social text) to infer his/her purchase preference over product categories on an e-commerce platform. To solve the task, a key problem is how to effectively represent the social text in a way that its information can be utilized on the e-commerce platform. We study two major kinds of text representation methods for predicting cross-site purchase preference: shallow textual features and deep textual features learned by deep neural network models. We conduct extensive experiments on a large linked dataset, and our experimental results indicate that it is promising to utilize the social text for predicting purchase preference. In particular, the deep neural network approach shows stronger predictive ability when the number of categories becomes large.
2017 Vol. 32 (4): 828-842 [Abstract] ( 31 ) [HTML 1KB] [PDF 1638KB] ( 34 )
   ISSN 1000-9000
   CN 11-2296/TP
Copyright © 2006 JCST! All rights reserved Editorial Office
Journal of Computer Science and Technology Institute of Computing Technology Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China