.tabbox {width:400px; margin-top: 15px;margin-bottom: 5px} .tabmenu {width:400px;height:28px;border-left:1px solid #CCC;border-top:1px solid #ccc;} .tabmenu ul {margin:0;padding:0;list-style-type: none;} .tabmenu li { text-align:center; float:left; display:block; width:99px; overflow:hidden; background-color: #f1f1f1; line-height:27px; border-right:#ccc 1px solid; border-bottom:#ccc 1px solid; display:inline;} .tabmenu .cli {text-align:center;float:left;display:block;width:99px;overflow:hidden;background-color: #fff;line-height:27px;border-right:#ccc 1px solid;border-bottom:#fff 1px solid;display:inline; cursor:pointer; color: #810505; font-weight:bold} #tabcontent {width:399px;background-color:#fff;border-left:#CCC 1px solid;border-right:#CCC 1px solid;border-bottom:#CCC 1px solid; height:60px;} #tabcontent ul {margin:0;padding:5px;list-style-type: none;} #tabcontent .hidden {display:none;} Search Browse by Issue Fig/Tab Adv Search
 HOME ABOUT JCST AUTHORS REVIEWERS PUBLISHED PAPERS FORTHCOMING PAPERS

›› 2014,Vol. 29 ›› Issue (3): 423-435.

• Special Section on Selected Paper from NPC 2011 •

### 基于最优集合覆盖的基因表达独占行二聚类研究

Amichai Painsky and Saharon Rosset

1. School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978, Israel
• 收稿日期:2013-08-30 修回日期:2014-03-14 出版日期:2014-05-05 发布日期:2014-05-05
• 作者简介:Amichai Painsky received his B.Sc. degree in electrical engineering from Tel Aviv University (2007) and his M.Eng. degree in electrical engineering from Princeton University (2009). He is currently carrying a statistics Ph.D. at School of Mathematical Sciences, Tel Aviv University. His research interests include data mining, machine learning, statistical learning and their connection to information theory.
• 基金资助:

This research was funded in part by Israeli Science Foundation under Grant No. 1227/09 and by a grant to Amichai Painsky from the Israeli Center for Absorption in Science.

### Optimal Set Cover Formulation for Exclusive Row Biclustering of Gene Expression

Amichai Painsky and Saharon Rosset

1. School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978, Israel
• Received:2013-08-30 Revised:2014-03-14 Online:2014-05-05 Published:2014-05-05
• About author:Amichai Painsky received his B.Sc. degree in electrical engineering from Tel Aviv University (2007) and his M.Eng. degree in electrical engineering from Princeton University (2009). He is currently carrying a statistics Ph.D. at School of Mathematical Sciences, Tel Aviv University. His research interests include data mining, machine learning, statistical learning and their connection to information theory.
• Supported by:

This research was funded in part by Israeli Science Foundation under Grant No. 1227/09 and by a grant to Amichai Painsky from the Israeli Center for Absorption in Science.

Abstract: The availability of large microarray data has led to a growing interest in biclustering methods in the past decade. Several algorithms have been proposed to identify subsets of genes and conditions according to different similarity measures and under varying constraints. In this paper we focus on the exclusive row biclustering problem (also known as projected clustering) for gene expression, in which each row can only be a member of a single bicluster while columns can participate in multiple clusters. This type of biclustering may be adequate, for example, for clustering groups of cancer patients where each patient (row) is expected to be carrying only a single type of cancer, while each cancer type is associated with multiple (and possibly overlapping) genes (columns). We present a novel method to identify these exclusive row biclusters in the spirit of the optimal set cover problem. We present our algorithmic solution as a combination of existing biclustering algorithms and combinatorial auction techniques. Furthermore, we devise an approach for tuning the threshold of our algorithm based on comparison with a null model, inspired by the Gap statistic approach. We demonstrate our approach on both synthetic and real world gene expression data and show its power in identifying large span non-overlapping rows submatrices, while considering their unique nature.

 [1] Madeira S C, Oliveira A L. Biclustering algorithms for biological data analysis: A survey. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2004, 1(1): 2445.[2] Cheng Y, Church G M. Biclustering of expression data. In Proc. the 8th Int. Conf. Intelligent Systems for Molecular Biology, Aug. 2000, pp.93-103.[3] Yang J, Wang W, Wang H, Yu P S. Enhanced biclustering on expression data. In Proc. the 3rd IEEE Symposium on Bioinformatics and Bioengineering, Mar. 2000, pp.321-327.[4] Sheng Q, Moreau Y, De Moor B. Biclustering microarray data by Gibbs sampling. Bioinformatics, 2003, 19(suppl. 2): 196205.[5] Tang C, Zhang L, Zhang A, Ramanathan M. Interrelated twoway clustering: An unsupervised approach for gene expression data analysis. In Proc. the 2nd IEEE Int. Symposium on Bioinformatics and Bioengineering, Nov. 2001, pp.41-48.[6] Divina F, Aguilar-Ruize J. Biclustering of expression data with evolutionary computation. IEEE Transactions on Knowledge and Data Engineering, 2006, 18(5): 590-602.[7] Aggarwal C C, Procopiuc C, Wolf J L, Yu P S, Park J S. Fast algorithm for projected clustering. ACM SIGMOD Record, 1999, 28(2): 61-72.[8] Yip K Y, Cheng D W, Ng M K. HARP: A practical projected clustering algorithm. IEEE Transactions on Knowledge and Data Engineering, 2004, 16(11): 1387-1397.[9] Bouguessa M, Wang S. PCGEN: A practical approach to projected clustering and its application to gene expression data. In Proc. the IEEE Symposium on Computational Intelligence and Data Mining, April 2007, pp.661-667.[10] Tanay A, Sharan R, Shamir R. Discovering statistically significant biclusters in gene expression data. Bioinformatics, 2002, 18(suppl. 1): 136-144.[11] Ayadi W, Elloumi M, Hao J K. BicFinder: A biclustering algorithm for microarray data analysis. Knowledge and Information Systems, 2012, 30(2): 341-358.[12] Vuki?evi? M, Kirchner K, Delibaši? B, Jovanovi? M, Ruhland J, Suknovi?M. Finding best algorithmic components for clustering microarray data. Knowledge and Information Systems, 2013, 35(1): 111-130.[13] Leyton-Brown K. Resource allocation in competitive multiagent systems [Ph.D. Thesis]. Stanford University, 2003.[14] Rothkopf M, Pekec A, Harstad R. Computationally manageable combinatorial auctions. Management Science, 1998, 44(8): 1131-1147.[15] de Vries S, Vohra R. Combinatorial auctions: A survey. INFORMS Journal on Computing, 2003, 15(3): 284-309.[16] Nisan N. Bidding and allocation in combinatorial auctions. In Proc. the 2nd ACM Conference on Electronic Commerce, Oct. 2000, pp.1-12.[17] Tenenholtz M. Some tractable cominatorial auctions. In Proc. the AAAI/IAAI, Jul. 2000, pp.98-103.[18] Sandholm T. Algorithm for optimal winner determination in combinatorial auctions. Artificial Intelligence, 2002, 135(1/2): 1-54.[19] Tibshirani R, Walther G, Hastie T. Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2001, 63(2): 411-423.[20] Mohajer M, Englmeier K H, Schmid V J. A comparison of Gap statistic definitions with and without logarithm function. arXiv:1103.4767v1 [Stat ME], 2011.[21] Armstrong S A, Stauton J E, Silveman L B et al. MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nature Genetics, 2002, 30(1): 41-47.[22] Gordon G, Jensen R, Hsiao L et al. Translation of microarray data into clinically relevant cancer diagnostic tests using gene expression ratios in lung cancer and mesothelioma. Cancer Research, 2002, 62(17): 4963-4967.[23] Hubert L, Araie P. Comparing partitions. Journal of Classificastion, 1985, 2(1): 193-218.[24] Duan K B, Rajapakse J C,Wang H, Azuaje F. Multiple SVMRFE for gene selection in cancer classification with expression data. IEEE Transactions on NanoBioscience, 2005, 4(3): 228-234.[25] Alba E, Garcia-Nieto J, Jourdan L, Talbi E G. Gene selection in cancer classification using PSO/SVM and GA/SVM hybrid algorithms. In Proc. the IEEE Congress on Evolutionary Computation, Sept. 2007, pp.284-290.
 No related articles found!
Viewed
Full text

Abstract

Cited

Shared
Discussed