Bimonthly    Since 1986
ISSN 1000-9000(Print)
CN 11-2296/TP
Indexed in:
Publication Details
Edited by: Editorial Board of Journal Of Computer Science and Technology
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Distributed by:
China: All Local Post Offices
Other Countries: Springer
  • Table of Content
      09 July 2010, Volume 25 Issue 4 Previous Issue    Next Issue
    For Selected: View Abstracts Toggle Thumbnails
    Special Section on Advances in Machine Learning and Applications
    Zhi-Hua Zhou and Hang Li
    Journal of Computer Science and Technology, 2010, 25 (4): 651-652.  DOI: 10.1007/s11390-010-1050-2
    Abstract   PDF(70KB) ( 1571 )   Chinese Summary
    Related Articles | Metrics
    Dirichlet Process Gaussian Mixture Models: Choice of the Base Distribution
    Dilan Gõrür and Carl Edward Rasmussen
    Journal of Computer Science and Technology, 2010, 25 (4): 653-664.  DOI: 10.1007/s11390-010-1051-1
    Abstract   PDF(437KB) ( 2016 )   Chinese Summary

    In the Bayesian mixture modeling framework it is possible to infer the necessary number of components to model the data and therefore it is unnecessary to explicitly restrict the number of components. Nonparametric mixture models sidestep the problem of finding the "correct" number of mixture components by assuming infinitely many components. In this paper Dirichlet process mixture (DPM) models are cast as infinite mixture models and inference using Markov chain Monte Carlo is described. The specification of the priors on the model parameters is often guided by mathematical and practical convenience. The primary goal of this paper is to compare the choice of conjugate and non-conjugate base distributions on a particular class of DPM models which is widely used in applications, the Dirichlet process Gaussian mixture model (DPGMM). We compare computational efficiency and modeling performance of DPGMM defined using a conjugate and a conditionally conjugate base distribution. We show that better density models can result from using a wider class of priors with no or only a modest increase in computational effort.

    References | Related Articles | Metrics
    Model Failure and Context Switching Using Logic-Based Stochastic Models
    Nikita A. Sakhanenko and George F. Luger
    Journal of Computer Science and Technology, 2010, 25 (4): 665-680.  DOI: 10.1007/s11390-010-1052-0
    Abstract   PDF(525KB) ( 1663 )   Chinese Summary

    This paper addresses parameter drift in stochastic models. We define a notion of context that represents invariant, stable-over-time behavior and we then propose an algorithm for detecting context changes in processing a stream of data. A context change is seen as model failure, when a probabilistic model representing current behavior is no longer able to "fit" newly encountered data. We specify our stochastic models using a first-order logic-based probabilistic modeling language called Generalized Loopy Logic (GLL). An important component of GLL is its learning mechanism that can identify context drift. We demonstrate how our algorithm can be incorporated into a failure-driven context-switching probabilistic modeling framework and offer several examples of its application.

    References | Related Articles | Metrics
    Combining Committee-Based Semi-Supervised Learning and Active Learning
    Mohamed Farouk Abdel Hady and Friedhelm Schwenker
    Journal of Computer Science and Technology, 2010, 25 (4): 681-698.  DOI: DOI 10.1007/s11390-010-1053-z
    Abstract   PDF(585KB) ( 2134 )   Chinese Summary

    Many data mining applications have a large amount of data but labeling data is usually difficult, expensive, or time consuming, as it requires human experts for annotation. Semi-supervised learning addresses this problem by using unlabeled data together with labeled data in the training process. Co-Training is a popular semi-supervised learning algorithm that has the assumptions that each example is represented by multiple sets of features (views) and these views are sufficient for learning and independent given the class. However, these assumptions are strong and are not satisfied in many real-world domains. In this paper, a single-view variant of Co-Training, called Co-Training by Committee (CoBC) is proposed, in which an ensemble of diverse classifiers is used instead of redundant and independent views. We introduce a new labeling confidence measure for unlabeled examples based on estimating the local accuracy of the committee members on its neighborhood. Then we introduce two new learning algorithms, QBC-then-CoBC and QBC-with-CoBC, which combine the merits of committee-based semi-supervised learning and active learning. The random subspace method is applied on both C4.5 decision trees and 1-nearest neighbor classifiers to construct the diverse ensembles used for semi-supervised learning and active learning. Experiments show that these two combinations can outperform other non committee-based ones.

    References | Related Articles | Metrics
    Ordinal-Class Core Vector Machine
    Bin Gu(顾 彬), Jian-Dong Wang(王建东), and Tao Li(李 涛)
    Journal of Computer Science and Technology, 2010, 25 (4): 699-708.  DOI: 10.1007/s11390-010-1054-y
    Abstract   PDF(356KB) ( 1689 )   Chinese Summary

    Ordinal regression is one of the most important tasks of relation learning, and several techniques based on support vector machines (SVMs) have also been proposed for tackling it, but the scalability aspect of these approaches to handle large datasets still needs much of exploration. In this paper, we will extend the recent proposed algorithm Core Vector Machine (CVM) to the ordinal-class data, and propose a new algorithm named as Ordinal-Class Core Vector Machine (OCVM). Similar with CVM, its asymptotic time complexity is linear with the number of training samples, while the space complexity is independent with the number of training samples. We also give some analysis for OCVM, which mainly includes two parts, the first one shows that OCVM can guarantee that the biases are unique and properly ordered under some situation; the second one illustrates the approximate convergence of the solution from the viewpoints of objective function and KKT conditions. Experiments on several synthetic and real world datasets demonstrate that OCVM scales well with the size of the dataset and can achieve comparable generalization performance with existing SVM implementations.

    References | Related Articles | Metrics
    Learning with Uncertain Kernel Matrix Set
    Lei Jia(贾 磊), Shi-Zhong Liao(廖士中), Member, CCF, and Li-Zhong Ding(丁立中)
    Journal of Computer Science and Technology, 2010, 25 (4): 709-727.  DOI: 10.1007/s11390-010-1055-x
    Abstract   PDF(837KB) ( 1712 )   Chinese Summary

    We study support vector machines (SVM) for which the kernel matrix is not specified exactly and it is only known to belong to a given uncertainty set. We consider uncertainties that arise from two sources: (i) data measurement uncertainty, which stems from the statistical errors of input samples; (ii) kernel combination uncertainty, which stems from the weight of individual kernel that needs to be optimized in multiple kernel learning (MKL) problem. Much work has been studied, such as uncertainty sets that allow the corresponding SVMs to be reformulated as semi-definite programs (SDPs), which is very computationally expensive however. Our focus in this paper is to identify uncertainty sets that allow the corresponding SVMs to be reformulated as second-order cone programs (SOCPs), since both the worst case complexity and practical computational effort required to solve SOCPs is at least an order of magnitude less than that needed to solve SDPs of comparable size. In the main part of the paper we propose four uncertainty sets that meet this criterion. Experimental results are presented to confirm the validity of these SOCP reformulations.

    References | Related Articles | Metrics
    Learning Query Ambiguity Models by Using Search Logs
    Ruihua Song(宋睿华), Member, ACM, Zhicheng Dou(窦志成), Hsiao-Wuen Hon(洪小文), Fellow, IEEE and Yong Yu(俞 勇)
    Journal of Computer Science and Technology, 2010, 25 (4): 728-738.  DOI: 10.1007/s11390-010-1056-9
    Abstract   PDF(306KB) ( 2306 )   Chinese Summary

    Identifying ambiguous queries is crucial to research on personalized Web search and search result diversity. Intuitively, query logs contain valuable information on how many intentions users have when issuing a query. However, previous work showed user clicks alone are misleading in judging a query as being ambiguous or not. In this paper, we address the problem of learning a query ambiguity model by using search logs. First, we propose enriching a query by mining the documents clicked by users and the relevant follow up queries in a session. Second, we use a text classifier to map the documents and the queries into predefined categories. Third, we propose extracting features from the processed data. Finally, we apply a state-of-the-art algorithm, Support Vector Machine (SVM), to learn a query ambiguity classifier. Experimental results verify that the sole use of click based features or session based features perform worse than the previous work based on top retrieved documents. When we combine the two sets of features, our proposed approach achieves the best effectiveness, specifically 86% in terms of accuracy. It significantly improves the click based method by 5.6% and the session based method by 4.6%.

    References | Related Articles | Metrics
    A New Approach for Multi-Document Update Summarization
    Chong Long(龙 翀), Min-Lie Huang(黄民烈), Xiao-Yan Zhu(朱小燕), Member, CCF and Ming Li(李 明), Fellow, ACM, IEEE
    Journal of Computer Science and Technology, 2010, 25 (4): 739-749.  DOI: 10.1007/s11390-010-1057-8
    Abstract   PDF(347KB) ( 1947 )   Chinese Summary

    Fast changing knowledge on the Internet can be acquired more efficiently with the help of automatic document summarization and updating techniques. This paper describes a novel approach for multi-document update summarization. The best summary is defined to be the one which has the minimum information distance to the entire document set. The best update summary has the minimum conditional information distance to a document cluster given that a prior document cluster has already been read. Experiments on the DUC/TAC 2007 to 2009 datasets (http://duc.nist.gov/, http://www.nist.gov/tac/) have proved that our method closely correlates with the human summaries and outperforms other programs such as LexRank in many categories under the ROUGE evaluation criterion.

    References | Related Articles | Metrics
    Multiple Hypergraph Clustering ofWeb Images by MiningWord2Image Correlations
    Fei Wu(吴 飞), Senior Member, CCF, Ya-Hong Han(韩亚洪), and Yue-Ting Zhuang(庄越挺), Member, IEEE
    Journal of Computer Science and Technology, 2010, 25 (4): 750-760.  DOI: 10.1007/s11390-010-1058-7
    Abstract   PDF(1124KB) ( 3150 )   Chinese Summary

    In this paper, we consider the problem of clustering Web images by mining correlations between images and their corresponding words. Since Web images always come with associated text, the corresponding textual tags of Web images are used as a source to enhance the description of Web images. However, each word has different contribution for the interpretation of image semantics. Therefore, in order to evaluate the importance of each corresponding word of Web images, we propose a novel visibility model to compute the extent to which a word can be perceived visually in images, and then infer the correlation of word to image by the integration of visibility with tf-idf. Furthermore, Latent Dirichlet Allocation (LDA) is used to discover topic information inherent in surrounding text and topic correlations of images could be defined for image clustering. For integrating visibility and latent topic information into an image clustering framework, we first represent textual correlated and latent-topic correlated images by two hypergraph views, and then the proposed Spectral Multiple Hypergraph Clustering (SMHC) algorithm is used to cluster images into categories. The SMHC could be regarded as a new unsupervised learning process with two hypergraphs to classify Web images. Experimental results show that the SMHC algorithm has better clustering performance and the proposed SMHC-based image clustering framework is effective.

    References | Related Articles | Metrics
    2D Correlative-Chain Conditional Random Fields for Semantic Annotation of Web Objects
    Yan-Hui Ding(丁艳辉), Member, CCF, Qing-Zhong Li(李庆忠), Senior Member, CCF Yong-Quan Dong(董永权), Member, CCF, and Zhao-Hui Peng(彭朝晖), Member, CCF
    Journal of Computer Science and Technology, 2010, 25 (4): 761-770.  DOI: 10.1007/s11390-010-1059-6
    Abstract   PDF(325KB) ( 1531 )   Chinese Summary

    Semantic annotation of Web objects is a key problem for Web information extraction. The Web contains an abundance of useful semi-structured information about real world objects, and the empirical study shows that strong two-dimensional sequence characteristics and correlative characteristics exist for Web information about objects of the same type across different Web sites. Conditional Random Fields (CRFs) are the state-of-the-art approaches taking the sequence characteristics to do better labeling. However, as the appearance of correlative characteristics between Web object elements, previous CRFs have their limitations for semantic annotation of Web objects and cannot deal with the long distance dependencies between Web object elements efficiently. To better incorporate the long distance dependencies, on one hand, this paper describes long distance dependencies by correlative edges, which are built by making good use of structured information and the characteristics of records from external databases; and on the other hand, this paper presents a two-dimensional Correlative-Chain Conditional Random Fields (2DCC-CRFs) to do semantic annotation of Web objects. This approach extends a classic model, two-dimensional Conditional Random Fields (2DCRFs), by adding correlative edges. Experimental results using a large number of real-world data collected from diverse domains show that the proposed approach can significantly improve the semantic annotation accuracy of Web objects.

    References | Related Articles | Metrics
    Multimodal Biometric Score Fusion Using Gaussian Mixture Model and Monte Carlo Method
    R Raghavendra, Rao Ashok, and G Hemantha Kumar
    Journal of Computer Science and Technology, 2010, 25 (4): 771-782.  DOI: 10.1007/s11390-010-1060-0
    Abstract   PDF(615KB) ( 1648 )   Chinese Summary

    Multimodal biometric fusion is gaining more attention among researchers in recent days. As multimodal biometric system consolidates the information from multiple biometric sources, the effective fusion of information obtained at score level is a challenging task. In this paper, we propose a framework for optimal fusion of match scores based on Gaussian Mixture Model (GMM) and Monte Carlo sampling based hypothesis testing. The proposed fusion approach has the ability to handle: 1) small size of match scores as is more commonly encountered in biometric fusion, and 2) arbitrary distribution of match scores which is more pronounced when discrete scores and multimodal features are present. The proposed fusion scheme is compared with well established schemes such as Likelihood Ratio (LR) method and weighted SUM rule. Extensive experiments carried out on five different multimodal biometric databases indicate that the proposed fusion scheme achieves higher performance as compared with other contemporary state of art fusion techniques.

    References | Related Articles | Metrics
    Robust Feature Extraction for Speaker Recognition Based on Constrained Nonnegative Tensor Factorization
    Qiang Wu(吴 强), Li-Qing Zhang(张丽清), and Guang-Chuan Shi(石光川)
    Journal of Computer Science and Technology, 2010, 25 (4): 783-792.  DOI: 10.1007/s11390-010-1061-z
    Abstract   PDF(425KB) ( 1642 )   Chinese Summary

    How to extract robust feature is an important research topic in machine learning community. In this paper, we investigate robust feature extraction for speech signal based on tensor structure and develop a new method called constrained Nonnegative Tensor Factorization (cNTF). A novel feature extraction framework based on the cortical representation in primary auditory cortex (A1) is proposed for robust speaker recognition. Motivated by the neural firing rates model in A1, the speech signal first is represented as a general higher order tensor. cNTF is used to learn the basis functions from multiple interrelated feature subspaces and find a robust sparse representation for speech signal. Computer simulations are given to evaluate the performance of our method and comparisons with existing speaker recognition methods are also provided. The experimental results demonstrate that the proposed method achieves higher recognition accuracy in noisy environment.

    References | Related Articles | Metrics
    Information Security
    New Constructions for Identity-Based Unidirectional Proxy Re-Encryption
    Jun-Zuo Lai(赖俊祚), Wen-Tao Zhu(朱文涛), Member, IEEE Robert H. Deng(邓慧杰), Senior Member, IEEE, Sheng-Li Liu(刘胜利), and Wei-Dong Kou(寇卫东)
    Journal of Computer Science and Technology, 2010, 25 (4): 793-806.  DOI: 10.1007/s11390-010-1062-y
    Abstract   PDF(337KB) ( 2117 )   Chinese Summary

    We address the cryptographic topic of proxy re-encryption (PRE), which is a special public-key cryptosystem. A PRE scheme allows a special entity, known as the proxy, to transform a message encrypted with the public key of a delegator (say Alice), into a new ciphertext that is protected under the public key of a delegatee (say Bob), and thus the same message can then be recovered with Bob's private key. In this paper, in the identity-based setting, we first investigate the relationship between so called mediated encryption and unidirectional PRE. We provide a general framework which converts any secure identity-based unidirectional PRE scheme into a secure identity-based mediated encryption scheme, and vice versa. Concerning the security for unidirectional PRE schemes, Ateniese et al. previously suggested an important property known as the master secret security, which requires that the coalition of the proxy and Bob cannot expose Alice's private key. In this paper, we extend the notion to the identity-based setting, and present an identity-based unidirectional PRE scheme, which not only is provably secure against the chosen ciphertext attack in the standard model but also achieves the master secret security at the same time.

    References | Related Articles | Metrics
    Generic Certificateless Encryption Secure Against Malicious-but-Passive KGC Attacks in the Standard Model
    Qiong Huang(黄 琼) and Duncan S. Wong(王 石)
    Journal of Computer Science and Technology, 2010, 25 (4): 807-826.  DOI: 10.1007/s11390-010-1063-x
    Abstract   PDF(408KB) ( 1838 )   Chinese Summary

    Despite the large number of certificateless encryption schemes proposed recently, many of them have been found insecure under a practical attack, called malicious-but-passive KGC (Key Generation Center) attack. In this work we propose the first generic construction of certificateless encryption, which can be proven secure against malicious-but-passive KGC attacks in the standard model. In order to encrypt a message of any length, we consider the KEM/DEM (key encapsulation mechanism/data encapsulation mechanism) framework in the certificateless setting, and propose a generic construction of certificateless key encapsulation mechanism (CL-KEM) secure against malicious-but-passive KGC attacks in the standard model. It is based on an identity-based KEM, a public key encryption and a message authentication code. The high efficiency of our construction is due to the efficient implementations of these underlying building blocks, and is comparable to Bentahar et al.'s CL-KEMs, which have only been proven secure under the random oracle model with no consideration of the malicious-but-passive KGC attack.
    We also introduce the notion of certificateless tag-based KEM (CL-TKEM), which is an extension of Abe et al.'s work to the certificateless setting. We show that an efficient CL-TKEM can be constructed by modifying our CL-KEM scheme. We also show that with a CL-TKEM and a data encapsulation mechanism secure under our proposed security model, an efficient certificateless hybrid encryption can be constructed by applying Abe et al.'s transformation in the certificateless setting.

    References | Related Articles | Metrics
    Algorithm and Complexity
    Certification of Thread Context Switching
    Yu Guo(郭 宇), Xin-Yu Jiang(蒋信予), and Yi-Yun Chen(陈意云)
    Journal of Computer Science and Technology, 2010, 25 (4): 827-840.  DOI: 10.1007/s11390-010-1064-9
    Abstract   PDF(541KB) ( 1673 )   Chinese Summary

    With recent efforts to build foundational certified software systems, two different approaches have been proposed to certify thread context switching. One is to certify both threads and context switching in a single logic system, and the other certifies threads and context switching at different abstraction levels. The former requires heavyweight extensions in the logic system to support first-class code pointers and recursive specifications. Moreover, the specification for context switching is very complex. The latter supports simpler and more natural specifications, but it requires the contexts of threads to be abstracted away completely when threads are certified. As a result, the conventional implementation of context switching used in most systems needs to be revised to make the abstraction work. In this paper, we extend the second approach to certify the conventional implementation, where the clear abstraction for threads is unavailable since both threads and context switching hold pointers of thread contexts. To solve this problem, we allow the program specifications for threads to refer to pointers of thread contexts. Thread contexts are treated as opaque structures, whose contents are unspecified and should never be accessed by the code of threads. Therefore, the advantage of avoiding the direct support of first-class code pointers is still preserved in our method. Besides, our new approach is also more lightweight. Instead of using two different logics to certify threads and context switching, we employ only one program logic with two different specifications for the context switching. One is used to certify the implementation itself, and the more abstract one is used as an interface between threads and context switching at a higher abstraction level. The consistency between the two specifications are enforced by the global program invariant.

    References | Related Articles | Metrics
    Formal Reasoning About Lazy-STM Programs
    Yong Li(李 勇), Yu Zhang(张 昱), Member, CCF, Yi-Yun Chen(陈意云), Member, CCF and Ming Fu(付 明)
    Journal of Computer Science and Technology, 2010, 25 (4): 841-852.  DOI: 10.1007/s11390-010-1065-8
    Abstract   PDF(410KB) ( 1980 )   Chinese Summary

    Transactional memory (TM) is an easy-using parallel programming model that avoids common problems associated with conventional locking techniques. Several researchers have proposed {a large amount of} alternative hardware and software TM implementations. However, few ones focus on formal reasoning about these TM programs. In this paper, we propose a framework at assembly level for reasoning about lazy software transactional memory (STM) programs. First, we give a software TM implementation based on lightweight locks. These locks are also one part of the shared memory. Then we define the semantics of the model operationally, and the lightweight locks in transaction are non-blocking, avoiding deadlocks {among} transactions. Finally we design a logic --- a combination of permission accounting in separation logic and concurrent separation logic --- to verify various properties of concurrent programs based on this machine model. The whole framework is formalized using a proof-carrying-code (PCC) framework.

    References | Related Articles | Metrics
    Sorting Unsigned Permutations by Weighted Reversals, Transpositions, and Transreversals
    Xiao-Wen Lou(娄晓文) and Da-Ming Zhu(朱大铭), Senior Member, CCF
    Journal of Computer Science and Technology, 2010, 25 (4): 853-863.  DOI: 10.1007/s11390-010-1066-7
    Abstract   PDF(447KB) ( 1626 )   Chinese Summary

    Reversals, transpositions and transreversals are common events in genome rearrangement. The genome rearrangement sorting problem is to transform one genome into another using the minimum number of given rearrangement operations. An integer permutation is used to represent a genome in many cases. It can be divided into disjoint strips with each strip denoting a block of consecutive integers. A singleton is a strip of one integer. And the genome rearrangement problem turns into the problem of sorting a permutation into the identity permutation equivalently. Hannenhalli and Pevzner designed a polynomial time algorithm for the unsigned reversal sorting problem on those permutations with O(log n) singletons. In this paper, first we describe one case in which Hannenhalli and Pevzner's algorithm may fail and propose a corrected approach. In addition, we propose a (1+ ε)-approximation algorithm for sorting unsigned permutations with O(\og n) singletons by reversals of weight 1 and transpositions/transreversals of weight 2.

    References | Related Articles | Metrics
    Architecture and High Performance Computer Systems
    Queue Waiting Time Aware Dynamic Workflow Scheduling in Multicluster Environments
    Zhi-Feng Yu(余志峰) and Wei-Song Shi(施巍松), Senior Member, IEEE
    Journal of Computer Science and Technology, 2010, 25 (4): 864-873.  DOI: 10.1007/s11390-010-1067-6
    Abstract   PDF(519KB) ( 1765 )   Chinese Summary

    Workflows are prevailing in scientific computation. Multicluster environments emerge and provide more resources, benefiting workflows but also challenging the traditional workflow scheduling heuristics. In a multicluster environment, each cluster has its own independent workload management system. Jobs are queued up before getting executed, they experience different resource availability and wait time if dispatched to different clusters. However, existing scheduling heuristics neither consider the queue wait time nor balance the performance gain with data movement cost. The proposed algorithm leverages the advancement of queue wait time prediction techniques and empirically studies if the tunability of resource requirements helps scheduling. The extensive experiment with both real workload traces and test bench shows that the queue wait time aware algorithm improves workflow performance by 3 to 10 times in terms of average makespan with relatively very low cost of data movement.

    References | Related Articles | Metrics
    A Unified Co-Processor Architecture for Matrix Decomposition
    Yong Dou(窦 勇), Member, CCF, ACM, IEEE, Jie Zhou(周 杰), Gui-Ming Wu(邬贵明) Jing-Fei Jiang(姜晶菲), Yuan-Wu Lei(雷元武), and Shi-Ce Ni(倪时策)
    Journal of Computer Science and Technology, 2010, 25 (4): 874-885.  DOI: 10.1007/s11390-010-1068-5
    Abstract   PDF(593KB) ( 1680 )   Chinese Summary

    QR and LU decompositions are the most important matrix decomposition algorithms. Many studies work on accelerating these algorithms by FPGA or ASIC in a case by case style. In this paper, we propose a unified framework for the matrix decomposition algorithms, combining three QR decomposition algorithms and LU algorithm with pivoting into a unified linear array structure. The QR and LU decomposition algorithms exhibit the same two-level loop structure and the same data dependency. Utilizing the similarities in loop structure and data dependency of matrix decomposition, we unify a fine-grained algorithm for all four matrix decomposition algorithms. Furthermore, we present a unified co-processor structure with a scalable linear array of processing elements (PEs), in which four types of PEs are same in the structure of memory channels and PE connections, but the only difference exists in the internal structure of data path. Our unified co-processor, which is IEEE 32-bit floating-point precision, is implemented and mapped onto a Xilinx Virtex5 FPGA chip. Experimental results show that our co-processors can achieve speedup of 2.3 to 14.9 factors compared to a Pentium Dual CPU with double SSE threads.

    References | Related Articles | Metrics
    Landing Stencil Code on Godson-T
    Hui-Min Cui(崔慧敏), Lei Wang(王 蕾), Dong-Rui Fan(范东睿), Member CCF, IEEE and Xiao-Bing Feng(冯晓兵)
    Journal of Computer Science and Technology, 2010, 25 (4): 886-894.  DOI: 10.1007/s11390-010-1069-4
    Abstract   PDF(503KB) ( 1499 )   Chinese Summary

    The advent of multi-core/many-core chip technology offers both an extraordinary opportunity and a profound challenge. In particular, computer architects and system software designers are faced with a unique opportunity to introducing new architecture features as well as adequate compiler technology --- together they may have profound impact. This paper presents a case study (using the 1-D Jacobi computation) of compiler-amendable performance optimization techniques on a many-core architecture Godson-T. Godson-T architecture has several unique features that are chosen for this study: 1) chip-level global addressable memory in particular the scratchpad memories (SPM) local to the processing cores; 2) fine-grain memory based synchronization (e.g., full-empty bit for fine-grain synchronization). Leveraging state-of-the-art performance optimization methods for 1-D stencil parallelization (e.g., timed tiling and variants), we developed and implement a number of many-core-based optimization for Godson-T. Our experimental study shows good performance in both execution time speedup and scalability, validate the value of globally accessed SPM and fine-grain synchronization mechanism (full-empty bits) under the Godson-T, and provides some useful guidelines for future compiler technology of many-core chip architectures.

    References | Related Articles | Metrics
  Journal Online
Just Accepted
Top Cited Papers
Top 30 Most Read
Paper Lists of Areas
Special Issues
   ScholarOne Manuscripts
   Log In

User ID:


  Forgot your password?

Enter your e-mail address to receive your account information.

ISSN 1000-9000(Print)

CN 11-2296/TP

Editorial Board
Author Guidelines
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: jcst@ict.ac.cn
  Copyright ©2015 JCST, All Rights Reserved