
A Novel Approach to Revealing Positive and Negative Co-Regulated Genes

Yu-Hai Zhao, Guo-Ren Wang, Ying Yin, and Guang-Yu Xu   

  1. Department of Computer Science and Engineering, Northeastern University, Shenyang 110004, China
  • Received: 2006-05-01 Revised: 2006-12-19 Online: 2007-03-10 Published: 2007-03-10

Biologists have identified a real and growing need to find co-regulated gene clusters, which include both positively and negatively regulated gene clusters. However, existing pattern-based and tendency-based clustering approaches are designed to find only positively regulated gene clusters. In this paper, a new subspace clustering model called g-Cluster is proposed for gene expression data. The proposed model has the following advantages: 1) it finds both positively and negatively co-regulated genes at once; 2) it is free of the restriction that co-regulated genes be related by a magnitude transformation; and 3) it guarantees the quality of clusters and the significance of regulations by means of a novel similarity measurement, gCode, and a user-specified regulation threshold $\delta$, respectively. No previous work addresses this task in full. Moreover, the minimum description length (MDL) technique is introduced to avoid generating insignificant g-Clusters. A tree structure, namely the GS-tree, is also designed, and two algorithms combined with efficient pruning and optimization strategies are developed to identify all qualified g-Clusters. Extensive experiments are conducted on real and synthetic datasets. The experimental results show that 1) the algorithms are able to find a number of co-regulated gene clusters missed by previous models, which are potentially of high biological significance, and 2) the algorithms are effective and efficient, and outperform the existing approaches.
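The distinction between positive and negative co-regulation under a regulation threshold can be illustrated with a small sketch. The abstract does not define gCode, so the encoding below is only an assumption made for illustration: each gene's expression profile is turned into a sequence of +1 (up), -1 (down), or 0 (change no larger than $\delta$) between consecutive conditions, and two genes are treated as positively co-regulated when their sequences match and as negatively co-regulated when one is the exact inverse of the other. The function names `regulation_code` and `relation`, the fixed condition order, and the rule that every change must be significant are all hypothetical choices, not the paper's method.

```python
from typing import List


def regulation_code(expr: List[float], delta: float) -> List[int]:
    """Encode each change between consecutive conditions.

    +1 = rise larger than delta, -1 = fall larger than delta,
    0 = change too small to count as a regulation (assumed rule).
    """
    code = []
    for prev, curr in zip(expr, expr[1:]):
        diff = curr - prev
        if diff > delta:
            code.append(1)
        elif diff < -delta:
            code.append(-1)
        else:
            code.append(0)
    return code


def relation(a: List[float], b: List[float], delta: float) -> str:
    """Classify two genes as 'positive', 'negative', or 'none'.

    Illustrative assumption: every change must be significant
    (no 0 in either code) for the pair to count as co-regulated.
    """
    code_a = regulation_code(a, delta)
    code_b = regulation_code(b, delta)
    if not code_a or 0 in code_a or 0 in code_b:
        return "none"
    if code_a == code_b:
        return "positive"
    if code_a == [-x for x in code_b]:
        return "negative"
    return "none"


# Made-up expression values over four conditions, for illustration only.
g1 = [1.0, 3.0, 2.0, 5.0]
g2 = [10.0, 20.0, 12.0, 30.0]   # same up/down tendency, different magnitudes
g3 = [9.0, 4.0, 8.0, 1.0]       # opposite tendency to g1
g4 = [1.0, 1.2, 2.0, 5.0]       # first change is below the threshold

print(relation(g1, g2, delta=0.5))
print(relation(g1, g3, delta=0.5))
print(relation(g1, g4, delta=0.5))
```

In this sketch, `g1` and `g2` share a tendency even though no shifting or scaling maps one onto the other, which is the kind of pair a magnitude-based pattern model would miss, while `g1` and `g3` move in opposite directions at every step.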

Key words: Ada concurrent program; rendezvous relation; rendezvous ordering graph;


ISSN 1000-9000(Print)

CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: jcst@ict.ac.cn
  Copyright ©2015 JCST, All Rights Reserved