Bimonthly    Since 1986
ISSN 1000-9000(Print)
CN 11-2296/TP
Publication Details
Edited by: Editorial Board of Journal of Computer Science and Technology
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Distributed by:
China: All Local Post Offices
Other Countries: Springer
  • Table of Contents
      05 May 2017, Volume 32 Issue 3
    Special Section of CVM 2017
    Shi-Min Hu, Niloy J. Mitra, Yizhou Yu
    Journal of Computer Science and Technology, 2017, 32 (3): 415-416.  DOI: 10.1007/s11390-017-1732-0
    Photographic Appearance Enhancement via Detail-based Dictionary Learning
    Zhi-Feng Xie, Shi Tang, Dong-Jin Huang, You-Dong Ding, Li-Zhuang Ma
    Journal of Computer Science and Technology, 2017, 32 (3): 417-429.  DOI: 10.1007/s11390-017-1733-z
    A number of edge-aware filters can efficiently boost the appearance of an image through detail decomposition and enhancement. However, they often fail to produce a photographic enhanced appearance because of visible artifacts, especially noise, halos and unnatural contrast. The essential reason is that the enhancement process lacks sufficient guidance and constraints from high-quality appearance. Our idea is therefore to train a detail dictionary from a large number of high-quality patches in order to constrain and control the entire appearance enhancement. In this paper, we propose a novel learning-based enhancement method for photographic appearance, which includes two main stages: dictionary training and sparse reconstruction. In the training stage, we construct a training set of detail patches extracted from high-quality photos, and then train an overcomplete detail dictionary by iteratively minimizing an L1-norm energy function. In the reconstruction stage, we employ the trained dictionary to reconstruct the boosted detail layer, and further formulate a gradient-guided optimization function to improve the local coherence between patches. Moreover, we propose two evaluation metrics to measure the performance of appearance enhancement. The experimental results demonstrate the effectiveness of our learning-based enhancement method.
    Static Scene Illumination Estimation from Video with Applications
    Bin Liu, Kun Xu, Ralph R. Martin
    Journal of Computer Science and Technology, 2017, 32 (3): 430-442.  DOI: 10.1007/s11390-017-1734-y
    We present a system that automatically recovers scene geometry and illumination from a video, providing a basis for various applications. Previous image-based illumination estimation methods require either user interaction or external information in the form of a database. We adopt structure-from-motion and multi-view stereo for initial scene reconstruction, and then estimate an environment map represented by spherical harmonics (as these perform better than other bases). We also demonstrate several video editing applications that exploit the recovered geometry and illumination, including object insertion (e.g., for augmented reality), shadow detection, and video relighting.
    Temporally Consistent Depth Map Prediction Using Deep CNN and Spatial-temporal Conditional Random Field
    Xu-Ran Zhao, Xun Wang, Qi-Chao Chen
    Journal of Computer Science and Technology, 2017, 32 (3): 443-456.  DOI: 10.1007/s11390-017-1735-x
    Methods based on deep convolutional neural networks (DCNNs) have recently kept setting new records on the task of predicting depth maps from monocular images. When dealing with video-based applications such as 2D-to-3D video conversion, however, these approaches tend to produce temporally inconsistent depth maps, since their CNN models are optimized over single frames. In this paper, we address this problem by introducing a novel spatial-temporal conditional random field (CRF) model into the DCNN architecture, which is able to enforce temporal consistency between depth map estimations over consecutive video frames. In our approach, temporally consistent superpixels (TSP) are first computed on an image sequence to establish the correspondence of targets in consecutive frames. A DCNN is then used to regress the depth value of each temporal superpixel, followed by a spatial-temporal CRF layer that models the relationship of the estimated depths in both the spatial and temporal domains. The parameters of both the DCNN and CRF models are jointly optimized with back propagation. Experimental results show that our approach is not only able to significantly enhance the temporal consistency of estimated depth maps over existing single-frame-based approaches, but also improves the depth estimation accuracy in terms of various evaluation metrics.
    Medical Sign Recognition of Lung Nodules based on Image Retrieval with Semantic Feature and Supervised Hashing
    Juan-Juan Zhao, Ling Pan, Peng-Fei Zhao, Xiao-Xian Tang
    Journal of Computer Science and Technology, 2017, 32 (3): 457-469.  DOI: 10.1007/s11390-017-1736-9
    Computer-aided diagnosis (CAD) technology can improve the efficiency with which physicians diagnose lung lesions, especially through sign recognition, which is important for distinguishing benign from malignant nodules. This paper proposes a new sign recognition method based on image retrieval for lung nodules. First, we construct a deep learning framework to extract semantic features that can effectively represent sign information. Then, we translate the high-dimensional image features into compact binary codes with principal component analysis (PCA) and supervised hashing. Next, we retrieve similar lung nodule images with the presented adaptive-weighted similarity calculation method. Finally, we recognize nodule signs from the retrieval results, which can also provide decision support for the diagnosis of lung lesions. The proposed method is validated on the publicly available databases Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) and Lung Computed Tomography (CT) Imaging Signs (LISS). The experimental results demonstrate that our method substantially improves retrieval performance, which reaches 87.29%. The overall recognition rate based on the retrieval results reaches 93.52%. Moreover, our method is also effective for real-life diagnosis data.
    Collective Representation for Abnormal Event Detection
    Renzhen Ye, Xuelong Li
    Journal of Computer Science and Technology, 2017, 32 (3): 470-479.  DOI: 10.1007/s11390-017-1737-8
    Abnormal event detection in crowded scenes is a hot topic in the computer vision and information retrieval communities. In this paper, we study the problem of detecting anomalous behaviors in video, and propose a robust collective representation with multi-feature descriptors for abnormal event detection. The proposed method represents different features in an identical representation, in which different features of the same topic show more common properties. Then, we build the intrinsic relation between different feature descriptors and capture concept drift in the video sequence, which allows abnormal events to be robustly discriminated from normal events. Experimental results on two benchmark datasets and a comparison with state-of-the-art methods validate the effectiveness of our method.
    Captioning Videos Using Large-Scale Image Corpus
    Xiao-Yu Du, Yang Yang, Liu Yang, Fu-Min Shen, Zhi-Guang Qin, Jin-Hui Tang
    Journal of Computer Science and Technology, 2017, 32 (3): 480-493.  DOI: 10.1007/s11390-017-1738-7
    Video captioning is the task of assigning complex high-level semantic descriptions (e.g., sentences or paragraphs) to video data. Unlike previous video analysis techniques such as video annotation, video event detection and action recognition, video captioning is much closer to human cognition, with a smaller semantic gap. However, the scarcity of captioned video data severely limits the development of video captioning. In this paper, we propose a novel video captioning approach that describes videos by leveraging a freely available image corpus with abundant literal knowledge. There are two key aspects of our approach: 1) an effective integration strategy bridging videos and images, and 2) high efficiency in handling ever-increasing training data. To achieve these goals, we adopt sophisticated visual hashing techniques to efficiently index and search large-scale images for relevant captions, an approach that extends readily to evolving data and the corresponding semantics. Extensive experimental results on various real-world visual datasets show the effectiveness of our approach with different hashing techniques, e.g., LSH (locality-sensitive hashing), PCA-ITQ (principal component analysis iterative quantization) and supervised discrete hashing, as compared with the state-of-the-art methods. It is worth noting that the empirical computational cost of our approach is much lower than that of an existing method, i.e., it takes 1/256 of the memory requirement and 1/64 of the time cost of the method of Devlin et al.
    Automatic Colorization with Improved Spatial Coherence and Boundary Localization
    Wei Zhang, Chao-Wei Fang, Guan-Bin Li
    Journal of Computer Science and Technology, 2017, 32 (3): 494-506.  DOI: 10.1007/s11390-017-1739-6
    Grayscale image colorization is an important computer graphics problem with a variety of applications. Recent fully automatic colorization methods have made impressive progress by formulating image colorization as a pixel-wise prediction task and utilizing deep convolutional neural networks. Though tremendous improvements have been made, the result of automatic colorization is still far from perfect. Specifically, common pitfalls remain in maintaining color consistency in homogeneous regions and in precisely distinguishing colors near region boundaries. To tackle these problems, we propose a novel fully automatic colorization pipeline which involves a boundary-guided CRF (conditional random field) and a CNN-based color transform as post-processing steps. In addition, as there usually exist multiple plausible colorization proposals for a single image, automatic evaluation of different colorization methods remains a challenging task. We further introduce two novel automatic evaluation schemes to efficiently assess colorization quality in terms of spatial coherence and localization. Comprehensive experiments demonstrate a substantial quality improvement in the results of our proposed colorization method under multiple evaluation metrics.
    Discriminative Histogram Intersection Metric Learning and Its Applications
    Peng-Yi Hao, Yang Xia, Xiao-Xin Li, Sei-ichiro Kamata, Sheng-Yong Chen
    Journal of Computer Science and Technology, 2017, 32 (3): 507-519.  DOI: 10.1007/s11390-017-1740-0
    In this paper, a novel method called discriminative histogram intersection metric learning (DHIML) is proposed for pair matching and classification. Specifically, we introduce a discrimination term for learning a metric from binary information such as same/not-same or similar/dissimilar, and then combine it with the classification error for discrimination in classifier construction. Compared with conventional approaches, the proposed method has several advantages. 1) The histogram intersection strategy is incorporated into metric learning to deal effectively with the widely used histogram features. 2) By introducing a discriminative term and a classification error term into metric learning, a more discriminative distance metric and a classifier can be learned together. 3) The objective function is robust to outliers and noise in both features and labels during training. The performance of the proposed method is tested on four applications: face verification, face-track identification, face-track clustering, and image classification. Evaluations on the challenging restricted protocol of the Labeled Faces in the Wild (LFW) benchmark, a dataset with more than 7000 face-tracks, and the Caltech-101 dataset validate the robustness and discriminability of the proposed metric learning, compared with recent state-of-the-art approaches.
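    For readers unfamiliar with the underlying similarity, the sketch below shows only the standard (unlearned) histogram intersection between two histograms, i.e., the sum of bin-wise minima. It is not the DHIML objective, which the abstract does not specify; the function name and the example histograms are illustrative assumptions.

```python
def histogram_intersection(h1, h2):
    """Standard histogram intersection: sum of bin-wise minima.

    Illustrative helper only; the learned DHIML metric from the
    abstract is not reproduced here.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical 3-bin histograms of raw counts.
similarity = histogram_intersection([2, 5, 3], [1, 6, 3])
```

    Larger values mean more overlap between the two histograms; a histogram compared with itself yields its own total count.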
    Prior-Free Dependent Motion Segmentation Using Helmholtz-Hodge Decomposition Based Object-Motion Oriented Map
    Cui-Cui Zhang, Zhi-Lei Liu
    Journal of Computer Science and Technology, 2017, 32 (3): 520-535.  DOI: 10.1007/s11390-017-1741-z
    Motion segmentation in moving-camera videos is a very challenging task because of the motion dependence between the camera and moving objects. Camera motion compensation is recognized as an effective approach. However, existing work depends on prior knowledge of the camera motion and scene structure for model selection, which is not always available in practice. Moreover, the image plane motion suffers from depth variations, which leads to depth-dependent motion segmentation in 3D scenes. To solve these problems, this paper develops a prior-free dependent motion segmentation algorithm by introducing a modified Helmholtz-Hodge decomposition (HHD) based object-motion oriented map (OOM). By decomposing the image motion (optical flow) into a curl-free and a divergence-free component, all kinds of camera-induced image motions can be represented by these two components in an invariant way. HHD identifies the camera-induced image motion as one segment irrespective of depth variations with the help of OOM. To segment object motions from the scene, we deploy a novel spatio-temporal constrained quadtree labeling. Extensive experimental results on benchmarks demonstrate that our method improves on the performance of the state of the art by 10%-20%, even on challenging scenes with complex backgrounds.
    A Survey on Human Performance Capture and Animation
    Shihong Xia, Lin Gao, Yu-Kun Lai, Ming-Ze Yuan, Jinxiang Chai
    Journal of Computer Science and Technology, 2017, 32 (3): 536-554.  DOI: 10.1007/s11390-017-1742-y
    With the rapid development of computing technology, three-dimensional (3D) human body models and their dynamic motions are widely used in the digital entertainment industry. Human performance mainly involves human body shapes and motions. Key research problems in human performance animation include how to capture and analyze the static geometric appearance and dynamic movement of human bodies, and how to simulate human body motions with physical effects. In this survey, following the main research directions of human body performance capture and animation, we summarize recent advances in key research topics, namely human body surface reconstruction, motion capture and synthesis, and physics-based motion simulation, and further discuss future research problems and directions. We hope this survey helps readers gain a comprehensive understanding of human performance capture and animation.
    Computer Network and Information Security
    Minimizing Resource Cost for Camera Stream Scheduling in Video Data Center
    Yi-Hong Gao, Hua-Dong Ma, Wu Liu
    Journal of Computer Science and Technology, 2017, 32 (3): 555-570.  DOI: 10.1007/s11390-017-1743-x
    Video surveillance service, which receives live streams from IP cameras and forwards the streams to end users, has become one of the most popular services of video data centers. A video data center focuses on minimizing the resource cost during resource provisioning for the service. However, little previous work comprehensively considers the bandwidth cost optimization of both upload and forwarding streams together with the capacity of the media server. In this paper, we propose an efficient resource scheduling approach for online multi-camera video forwarding, which tries to optimize the resource sharing of media servers and the networks together. Firstly, we not only provide a fine-grained resource usage model for media servers, but also evaluate the bandwidth cost of both upload and forwarding streams. Without loss of generality, we utilize two resource pricing models with different resource cost functions to evaluate the resource cost: the linear cost function and the non-linear cost functions. Then, we formulate the cost minimization problem as a constrained integer programming problem. For the linear resource cost function, the drift-plus-penalty optimization method is exploited in our approach. For non-linear resource cost functions, the approach employs a heuristic method to reduce both media server cost and bandwidth cost. The experimental results demonstrate that our approach markedly reduces the total resource costs on both media servers and networks simultaneously.
    Two-Type Information Fusion Based IP-to-AS Mapping Table Refining
    Qing Jiang, Hang-Yu Hu, Guang-Min Hu
    Journal of Computer Science and Technology, 2017, 32 (3): 571-584.  DOI: 10.1007/s11390-017-1744-9
    The Internet topology at the autonomous system (AS) level is of great importance, and traceroute has been known to be a potential tool to obtain a complete AS topology. The original IP-to-AS mapping table maps the IP addresses in traceroute paths to their origin ASes, which may cause false AS links. The existing methods refine the original mapping table based on traceroute-BGP path pairs or alias resolution data. However, the information extracted from either of them is inaccurate and incomplete. In this paper, we present a two-type information fusion based method to refine the original mapping table. We extract four kinds of information from path pair and alias resolution data. Based on this information, we build a candidate AS set for each router. Then we choose the AS that is consistent with the existing information to be the owner AS of each router and map all of the IP addresses on the router to it. We validate the result with the ground truth from PeeringDB and Looking Glass servers. Compared with the existing methods, our method produces a more accurate mapping table. In addition, we discuss the coverage of our method and show that our method is convergent and more robust against the reduction of information or the increase of incorrect information.
    Protecting User Privacy in a Multi-Path Information-Centric Network Using Multiple Random-Caches
    Wei-Bo Chu, Li-Fang Wang, Ze-Jun Jiang, Alan Chin-Chen Chang
    Journal of Computer Science and Technology, 2017, 32 (3): 585-598.  DOI: 10.1007/s11390-017-1730-2
    In-network caching is a fundamental mechanism advocated by information-centric networks (ICNs) for efficient content delivery. However, this new mechanism also brings serious privacy risks due to cache snooping attacks. One effective solution to this problem is random-cache, where the cache in a router randomly mimics a cache hit or a cache miss for each content request/probe. In this paper, we investigate the effectiveness of using multiple random-caches to protect cache privacy in a multi-path ICN. We propose models for characterizing the privacy of multi-path ICNs with random-caches, and analyze two different attack scenarios: 1) prefix-based attacks and 2) suffix-based attacks. Both homogeneous and heterogeneous caches are considered. Our analysis shows that in a multi-path ICN an adversary can potentially gain more privacy information by adopting prefix-based attacks. Furthermore, heterogeneous caches provide much better privacy protection than homogeneous ones under both attacks. The effect of different parameters on the privacy of multi-path random-caches is further investigated, and a comparison with the single-path counterpart is carried out based on numerical evaluations. The analysis and results in this paper provide insights into designing and evaluating multi-path ICNs when privacy is taken into consideration.
    Theory and Algorithms
    Private Keyword-Search for Database Systems Against Insider Attacks
    Peng Jiang, Yi Mu, Fuchun Guo, Qiao-Yan Wen
    Journal of Computer Science and Technology, 2017, 32 (3): 599-617.  DOI: 10.1007/s11390-017-1745-8
    The notion of searchable encrypted keywords introduced an elegant approach to retrieving encrypted data without the need for decryption. Since the introduction of this notion, two main searchable encrypted keywords techniques have emerged: symmetric searchable encryption (SSE) and public key encryption with keyword search (PEKS). Due to the complicated key management problem in SSE, a number of concrete PEKS constructions have been proposed to overcome it. However, the security of these PEKS schemes was only weakly defined in the presence of outsider attacks; therefore they suffer from keyword guessing attacks from the database server as an insider. How to resist insider attacks remains a challenging problem. We propose the first searchable encrypted keywords against insider attacks (SEK-IA) framework to address this problem. The security model of SEK-IA under the public key environment is rebuilt. We give a concrete SEK-IA construction featuring a constant-size trapdoor, and the proposed scheme is formally proved to be secure against insider attacks. The performance evaluations show that the communication cost between the receiver and the server in our SEK-IA scheme remains constant, independent of the sender identity set size, and that the receiver needs only minimal computational cost to generate a trapdoor to search the data from multiple senders.
    Optimal Path Embedding in the Exchanged Crossed Cube
    Dong-Fang Zhou, Jian-Xi Fan, Cheng-Kuan Lin, Bao-Lei Cheng, Jing-Ya Zhou, Zhao Liu
    Journal of Computer Science and Technology, 2017, 32 (3): 618-629.  DOI: 10.1007/s11390-017-1729-8
    The (s + t + 1)-dimensional exchanged crossed cube, denoted as ECQ(s, t), combines the strong points of the exchanged hypercube and the crossed cube. It has been proven that ECQ(s, t) has more attractive properties than other variations of the fundamental hypercube in terms of fewer edges, lower cost factor and smaller diameter. In this paper, we study the embedding of paths of distinct lengths between any two different vertices in ECQ(s, t). We prove the following result for ECQ(s, t): if s > 3, t > 3, then for any two different vertices, all paths whose lengths are between max{9, ⌈(s+1)/2⌉ + ⌈(t+1)/2⌉ + 4} and 2^(s+t+1) - 1 can be embedded between the two vertices with dilation 1. Note that the diameter of ECQ(s, t) is ⌈(s+1)/2⌉ + ⌈(t+1)/2⌉ + 2. The obtained result is optimal in the sense that the dilations of the path embeddings are all 1. The result reveals that ECQ(s, t) preserves the path embedding capability to a large extent, while it has only about half as many edges as CQ_n.
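    As a worked illustration of the path-length bounds stated in the abstract above, the following sketch evaluates them for given s and t. The function name is an illustrative assumption, and the condition s > 3, t > 3 is taken as printed in the abstract.

```python
def ecq_path_length_range(s: int, t: int) -> tuple[int, int]:
    """Return (shortest, longest) embeddable path length in ECQ(s, t).

    Assumes the condition s > 3, t > 3 exactly as printed in the abstract.
    """
    if s <= 3 or t <= 3:
        raise ValueError("the result is stated for s > 3 and t > 3")
    # For integers, (n + 2) // 2 equals ceil((n + 1) / 2).
    diameter = (s + 2) // 2 + (t + 2) // 2 + 2
    shortest = max(9, diameter + 2)   # max{9, ceil((s+1)/2) + ceil((t+1)/2) + 4}
    longest = 2 ** (s + t + 1) - 1    # 2^(s+t+1) - 1
    return shortest, longest


# Example: ECQ(4, 4) has diameter 3 + 3 + 2 = 8.
bounds = ecq_path_length_range(4, 4)
```

    For ECQ(4, 4) this gives path lengths from 10 to 511, so the shortest guaranteed path is two edges longer than the diameter.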
    Regular Paper
    Length-Changeable Incremental Extreme Learning Machine
    You-Xi Wu, Dong Liu, He Jiang
    Journal of Computer Science and Technology, 2017, 32 (3): 630-643.  DOI: 10.1007/s11390-017-1746-7
    Extreme learning machine (ELM) is a learning algorithm for generalized single-hidden-layer feed-forward networks (SLFNs). To obtain a suitable network architecture, the incremental extreme learning machine (I-ELM), a variant of ELM, constructs SLFNs by adding hidden nodes one by one. Although various I-ELM-class algorithms have been proposed to improve the convergence rate or to obtain minimal training error, they either do not change the way I-ELM constructs networks or face the risk of over-fitting. Making the testing error converge quickly and stably therefore becomes an important issue. In this paper, we propose a new incremental ELM referred to as the length-changeable incremental extreme learning machine (LCI-ELM). It allows more than one hidden node to be added to the network at a time, and the existing network is regarded as a whole in output weight tuning. The output weights of newly added hidden nodes are determined using a partial error-minimizing method. We prove that an SLFN constructed using LCI-ELM has approximation capability on a universal compact input set as well as on a finite training set. Experimental results demonstrate that LCI-ELM achieves a higher convergence rate as well as a lower over-fitting risk than some competitive I-ELM-class algorithms.
    EntityManager: Managing Dirty Data Based on Entity Resolution
    Xue-Li Liu, Hong-Zhi Wang, Jian-Zhong Li, Hong Gao
    Journal of Computer Science and Technology, 2017, 32 (3): 644-661.  DOI: 10.1007/s11390-017-1731-1
    Data quality is important in many data-driven applications, such as decision making, data analysis, and data mining. Recent studies focus on data cleaning techniques that delete or repair the dirty data, which may cause information loss and introduce new inconsistencies. To avoid these problems, we propose EntityManager, a general system to manage dirty data without data cleaning. This system takes the real-world entity as the basic storage unit and retrieves query results according to the quality requirements of users. The system is able to handle all kinds of inconsistencies recognized by entity resolution. We describe the EntityManager system in detail, covering its architecture, data model, and query processing techniques. To process queries efficiently, our system adopts novel indices, similarity operators and query optimization techniques. Finally, we verify the efficiency and effectiveness of this system and present future research challenges.
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: jcst@ict.ac.cn