Journal of Computer Science and Technology, 2010, 25(1): 131-153. ISSN: 1000-9000. CN: 11-2296/TP

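Title (translated from Chinese): Network-Based Predictions and Simulations by Biological State Space Models: Search for Drug Mode of Action
Reader's guide (translated from Chinese): Time-course expression data are difficult to analyze because they have few sample points and high dimensionality. The authors propose a new biological state space model for reverse-engineering gene regulatory networks from microarray data.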

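Time-course expression profile analysis differs greatly from time series analysis in other fields. Elsewhere, time series usually have long records (many sample points), low dimensionality (often a single variable), and a constant interval between observations. Biology must study thousands of variables (genes) at once, yet because of experimental cost only a few time points (<10) can usually be sampled. A state space model can therefore be used to reduce the dimensionality. A state space model assumes that a p-dimensional time series is generated by a k-dimensional (k<
The proposed biological state space model, BSSMBasic, has two parts: an autoregressive model of the hidden variables and a linear regression model of the observations on the hidden variables. Some constraints must be introduced to guarantee a unique solution. The original state space model and the added constraints can be combined into a vector autoregressive model. Through matrix operations, the original p-dimensional data can be projected onto a system made up of k modules. To handle replicated experimental data, the authors apply the original model to each replicate separately; this modified model is called BSSMRepli. To prevent overfitting, the authors add an L1-norm penalty function; the resulting model is called BSSMVAR.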
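The two-part structure described above (an autoregression on hidden variables plus a linear regression of the observations on them) can be illustrated with a minimal simulation sketch. The symbols `F`, `H`, `Q`, `R`, the function name `simulate_ssm`, the Gaussian noise, and all numeric values are assumptions chosen for illustration; they are not taken from the paper, and the identifiability constraints and the L1 penalty are not modeled here.

```python
import numpy as np

def simulate_ssm(F, H, Q, R, x0, n_steps, rng):
    """Simulate a linear Gaussian state space model (illustrative only).

    Hidden state:  x_t = F x_{t-1} + v_t,  v_t ~ N(0, Q)   (k-dimensional)
    Observation:   y_t = H x_t     + w_t,  w_t ~ N(0, R)   (p-dimensional)
    """
    k, p = F.shape[0], H.shape[0]
    xs = np.zeros((n_steps, k))
    ys = np.zeros((n_steps, p))
    x = x0
    for t in range(n_steps):
        # Autoregressive step on the k hidden "module" variables.
        x = F @ x + rng.multivariate_normal(np.zeros(k), Q)
        xs[t] = x
        # Each of the p observed genes is a noisy linear function of the modules.
        ys[t] = H @ x + rng.multivariate_normal(np.zeros(p), R)
    return xs, ys

rng = np.random.default_rng(0)
k, p, n_steps = 2, 50, 8                 # few modules, many genes, short series
F = np.array([[0.9, -0.2], [0.2, 0.9]])  # hypothetical module dynamics
H = rng.normal(size=(p, k))              # hypothetical gene loadings
Q = 0.01 * np.eye(k)
R = 0.1 * np.eye(p)
xs, ys = simulate_ssm(F, H, Q, R, np.ones(k), n_steps, rng)
print(xs.shape, ys.shape)
```

The sizes mirror the setting in the text: many observed variables (p = 50) driven by a few hidden modules (k = 2) over fewer than ten time points.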

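The EM algorithm is used to infer the regression parameters and the value of each module of the k-dimensional dynamic system; the likelihood and the states are estimated by the Kalman filter. The authors use the Bayesian information criterion and rank analysis to select the optimal model. When replicate data are available, the prediction error on the replicates is used for model selection.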
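As a rough illustration of how a Kalman filter yields the likelihood that the Bayesian information criterion then scores, the sketch below evaluates the log-likelihood of a linear Gaussian state space model at fixed parameters. The function names `kalman_loglik` and `bic`, the symbols `F`, `H`, `Q`, `R`, `x0`, `P0`, and the parameter count in the demo are assumptions for illustration. The EM parameter updates, the rank analysis, and the replicate-based prediction error are not implemented.

```python
import numpy as np

def kalman_loglik(ys, F, H, Q, R, x0, P0):
    """Log-likelihood of observations ys (n x p) under
    x_t = F x_{t-1} + v_t, y_t = H x_t + w_t, via the Kalman filter."""
    p = ys.shape[1]
    x, P = x0, P0
    loglik = 0.0
    for y in ys:
        # Prediction step: propagate the state mean and covariance.
        x = F @ x
        P = F @ P @ F.T + Q
        # Innovation (one-step prediction error) and its covariance.
        e = y - H @ x
        S = H @ P @ H.T + R
        _, logdet = np.linalg.slogdet(S)
        loglik += -0.5 * (p * np.log(2 * np.pi) + logdet
                          + e @ np.linalg.solve(S, e))
        # Update step: Kalman gain K = P H^T S^{-1} (S and P are symmetric).
        K = np.linalg.solve(S, H @ P).T
        x = x + K @ e
        P = P - K @ H @ P
    return loglik

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; smaller is better."""
    return -2.0 * loglik + n_params * np.log(n_obs)

# Demo with arbitrary data and hypothetical parameter values.
rng = np.random.default_rng(1)
n, p, k = 8, 3, 2
ys = rng.normal(size=(n, p))
F = 0.5 * np.eye(k)
H = rng.normal(size=(p, k))
ll = kalman_loglik(ys, F, H, np.eye(k), np.eye(p), np.zeros(k), np.eye(k))
n_params = k * k + p * k  # counts only F and H; one possible convention
print(ll, bic(ll, n_params, n))
```

In a model selection loop, one would fit the parameters for each candidate k and keep the k with the smallest criterion value; how parameters and observations are counted is a modeling choice not specified in the text above.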

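To validate the method, the authors tested it on yeast cell cycle data, data on vascular endothelial cell apoptosis caused by a lack of epidermal growth factor, and data on small airway epithelial cells stimulated with gefitinib. In the cell cycle data, the genes in the modules found by BSSM are highly consistent in both expression and function. A comparison with Beal's VBSSM shows that the authors' method uncovers clearer regulatory patterns, and the network identified by BSSM agrees well with existing knowledge.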

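Directions worth future study include: 1) more stable parameter estimation methods, more effective model selection methods, or the incorporation of more biological knowledge through Bayesian approaches; 2) handling higher dimensions; 3) nonlinear models; 4) integration with network analysis; 5) combining statistical analysis with the simulation of cellular networks.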

Network-Based Predictions and Simulations by Biological State Space Models: Search for Drug Mode of Action

Rui Yamaguchi, Seiya Imoto, and Satoru Miyano

Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan

Abstract:

Since time-course microarray data are short but contain a large number of genes, most statistical models must be extended to handle such statistically irregular situations. We introduce biological state space models, which are established as suitable computational models for constructing gene networks from microarray gene expression data. This chapter elucidates the theory and methodology of our biological state space models, together with some representative analyses including the discovery of drug mode of action. Through these applications, we show the whole strategy of biological state space model analysis: experimental design of time-course data, model building, and analysis of the estimated networks.

Keywords: gene networks; state space models; time-course gene expression data
Received: 2009-09-30; Revised: 2009-11-21

About the authors:
Rui Yamaguchi is a lecturer at the Human Genome Center, Institute of Medical Science, University of Tokyo. He received his Ph.D. degree in science from Kyushu University in 2003. His current research interests cover high dimensional time-course gene expression data analysis by state space models, biological pathway analysis and data assimilation, and sequence analysis for next generation sequencing data.
Seiya Imoto is currently an associate professor at the Human Genome Center, Institute of Medical Science, University of Tokyo. He received the B.S., M.S., and Ph.D. degrees in mathematics from Kyushu University, Japan, in 1996, 1998 and 2001, respectively. His current research interests cover statistical analysis of high dimensional data by Bayesian approaches, biomedical information analysis, microarray gene expression data analysis, gene network estimation and analysis, data assimilation in biological networks, and computational drug target discovery.
Satoru Miyano is a professor at the Human Genome Center, Institute of Medical Science, University of Tokyo. He received the B.S., M.S. and Ph.D. degrees, all in mathematics, from Kyushu University, Japan, in 1977, 1979 and 1984, respectively. His research group is developing computational methods for inferring gene networks from microarray gene expression data and other biological data, e.g., protein-protein interactions and promoter sequences. The group also developed a software tool, Cell Illustrator, for modeling and simulation of various biological systems. Currently, his research group is working intensively on developing a molecular network model of lung cancer from time-course gene expression and proteome data. With these technical achievements, his research is now heading toward the creation of Systems Pharmacology.

References:

[1] Spellman P T, Sherlock G, Zhang M Q, Iyer V R, Anders K, Eisen M B, Brown P O, Botstein D, Futcher B. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell, 1998, 9(12): 3273-3297.
[2] Friedman N, Linial M, Nachman I, Pe’er D. Using Bayesian networks to analyze expression data. J. Comp. Biol., 2000, 7(3/4): 601-620.
[3] Imoto S, Goto T, Miyano S. Estimation of genetic networks and functional structures between genes by using Bayesian network and nonparametric regression. Pacific Symposium on Biocomputing, 2002, 7: 175-186.
[4] Kim S, Imoto S, Miyano S. Dynamic Bayesian network and nonparametric regression for nonlinear modeling of gene networks from time series gene expression data. Biosystems, 2004, 75(1-3): 57-65.
[5] Murphy K, Mian S. Modelling gene expression data using dynamic Bayesian networks. Technical Report, Computer Science Division, University of California, Berkeley, USA, 1999.
[6] Basso K, Margolin A A, Stolovitzky G, Klein U, Dalla-Favera R, Califano A. Reverse engineering of regulatory networks in human B cells. Nat. Genet., 2005, 37(4): 382-390.
[7] Kitagawa G, Gersch W. Smoothness Priors Analysis of Time Series. New York: Springer-Verlag, 1996.
[8] West M, Harrison J. Bayesian Forecasting and Dynamic Models. Second Edition, New York: Springer-Verlag, 1997.
[9] Hirose O, Yoshida R, Imoto S, Yamaguchi R, Higuchi T, Charnock-Jones S D, Print C, Miyano S. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models. Bioinformatics, 2008, 24(7): 932-942.
[10] Yoshida R, Imoto S, Higuchi T. Estimating time-dependent gene networks from time series microarray data by dynamic linear models with Markov switching. In Proc. IEEE Computational Systems Bioinformatics Conference, Stanford, USA, Aug. 8-11, 2005, pp.289-298.
[11] Kojima K, Yamaguchi R, Imoto S, Yamauchi M, Nagasaki M, Yoshida R, Shimamura T, Ueno K, Higuchi T, Gotoh N, Miyano S. A state space representation of VAR models with sparse learning for dynamic gene networks. Genome Informatics, 2009, 22: 56-58.
[12] Shumway R H, Stoffer D S. An approach to time series smoothing and forecasting using the EM algorithm. J. Time Series Analysis, 1982, 3(4): 253-264.
[13] Shumway R H. Dynamic mixed models for irregularly observed time series. Resenhas-Reviews of the Institute of Mathematics and Statistics, University of Sao Paulo, Brazil: USP Press, 2000, 4(4): 433-456.
[14] Kalman R E. A new approach to linear filtering and prediction problems. Trans. Amer. Soc. Mech. Eng., J. Basic Engineering, 1960, 82: 35-45.
[15] Yamaguchi R, Yoshida R, Imoto S, Higuchi T, Miyano S. Finding module-based gene networks with state-space models — Mining high-dimensional and short time-course gene expression data. IEEE Signal Processing Magazine, 2007, 24(1): 37-46.
[16] Shimamura T, Yamaguchi R, Imoto S, Miyano S. Weighted lasso in graphical Gaussian modeling for large gene network estimation based on microarray data. Genome Informatics, 2007, 19: 142-153.
[17] Efron B, Hastie T, Johnstone I, Tibshirani R. Least angle regression. Annals of Statistics, 2004, 32(2): 407-499.
[18] Li Z, Shaw S M, Yedwabnick M J, Chan C. Using a state-space model with hidden variables to infer transcription factor activities. Bioinformatics, 2006, 22(6): 747-754.
[19] Wu F X, Zhang A J, Kusalik A J. Modeling gene expression from microarray expression data with state-space equations. Pacific Symposium on Biocomputing, 2004, 9: 581-592.
[20] Rangel C, Angus J, Ghahramani Z, Lioumi M, Sotheran E, Gaiba A, Wild D L, Falciani F. Modeling T-cell activation using gene expression profiling and state-space models. Bioinformatics, 2004, 20(9): 1361-1372.
[21] Beal M J, Falciani F, Ghahramani Z, Rangel C, Wild D L. A Bayesian approach to reconstructing genetic regulatory networks with hidden factors. Bioinformatics, 2005, 21(3): 349-356.
[22] Boyle E I, Weng S, Gollub J, Jin H, Botstein D, Cherry J M, Sherlock G. GO::TermFinder—Open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes. Bioinformatics, 2004, 20(18): 3710-3715.
[23] Affara M, Dunmore B, Savoie C, Imoto S, Tamada Y, Araki H, Charnock-Jones D S, Miyano S, Print C. Understanding endothelial cell apoptosis: What can the transcriptome, glycome and proteome reveal? Philosophical Transactions of the Royal Society B, 2007, 362(1484): 1469-1487.
[24] Johnson N A, Sengupta S, Saidi S A, Lessan K, Charnock-Jones S D, Scott L, Stephens R, Freeman T C, Tom B D, Harris M, Denyer G, Sundaram M, Sasisekharan R, Smith S K, Print C G. Endothelial cells preparing to die by apoptosis initiate a program of transcriptome and glycome regulation. FASEB J., 2003, 18(1): 188-190.
[25] Carmeliet P. Mechanisms of angiogenesis and arteriogenesis. Nature Medicine, 2000, 6(4): 389-395.
[26] Gerber H P, Hillan K J, Ryan A M, Kowalski J, Keller G A, Rangell L, Wright B D, Radtke F, Aguet M, Ferrara N. VEGF is required for growth and survival in neonatal mice. Development, 1999, 126(6): 1149-1159.
[27] Silverman B W. Density Estimation for Statistics and Data Analysis. London: Chapman & Hall, 1986.
[28] Aggarwal B B. Tumor necrosis factors receptor associated signaling molecules and their role in activation of apoptosis, JNK and NF-κB. Ann. Rheum. Dis., 2000, 59(Suppl. I): i6-i16.
[29] Keifer J A, Guttridge D C, Ashburner B P, Baldwin A S Jr. Inhibition of NF-κB activity by thalidomide through suppression of IκB kinase activity. J. Biol. Chem., 2001, 276(25): 22382-22387.
[30] Schwenzer R. The human tumor necrosis factor (TNF) receptor-associated factor 1 gene (TRAF1) is up-regulated by cytokines of the TNF ligand family and modulates TNF-induced activation of NF-κB and c-Jun N-terminal kinase. J. Biol. Chem., 1999, 274(27): 19368-19374.
[31] Han Y, Weinman S, Boldogh I, Walker R K, Brasier A R. Tumor necrosis factor-α-inducible IκBα proteolysis mediated by cytosolic m-calpain. A mechanism parallel to the ubiquitin-proteasome pathway for nuclear factor-κB activation. J. Biol. Chem., 1999, 274(2): 787-794.
[32] Mukherji M, Bell R, Supekova L, Wang Y, Orth A P, Batalov S, Miraglia L, Huesken D, Lange J, Martin C, Sahasrabudhe S, Reinhardt M, Natt F, Hall J, Mickanin C, Labow M, Chanda S K, Cho C Y, Schultz P G. Genome-wide functional analysis of human cell-cycle regulators. Proc. Natl. Acad. Sci. USA, 2006, 103(40): 14819-14824.
[33] Yamaguchi R, Imoto S, Yamauchi M, Nagasaki M, Yoshida R, Shimamura T, Hatanaka Y, Ueno K, Higuchi T, Gotoh N, Miyano S. Predicting differences in gene regulatory systems by state space models. Genome Informatics, 2008, 21: 101-113.
[34] Gupta P K, Yoshida R, Imoto S, Yamaguchi R, Miyano S. Statistical absolute evaluation of gene ontology terms with gene expression data. In Proc. the 3rd Int. Symp. Bioinformatics Research and Applications, Atlanta, USA, May 7-10, 2007, LNCS 4463, Springer, Berlin/Heidelberg, pp.146-157.
[35] Yamaguchi R, Yamamoto M, Imoto S, Nagasaki M, Yoshida R, Tsuiji K, Ishige A, Asou H, Watanabe K, Miyano S. Identification of activated transcription factors from microarray gene expression data of kampo medicine-treated mice. Genome Informatics, 2007, 18: 119-129.
[36] Tamada Y, Imoto S, Araki H, Nagasaki M, Print C, Charnock-Jones D S, Miyano S. Estimating genome-wide gene networks using nonparametric Bayesian network models on massively parallel computers. IEEE/ACM Trans. Computational Biology and Bioinformatics. (in press)
[37] Nagasaki M, Yamaguchi R, Yoshida R, Imoto S, Doi A, Tamada Y, Matsuno H, Miyano S, Higuchi T. Genomic data assimilation for estimating hybrid functional Petri net from time-course gene expression data. Genome Informatics, 2006, 17(1): 46-61.
[38] Cell Illustrator. http://www.cellillustrator.com/, Oct. 1, 2009.
[39] Nagasaki M, Doi A, Matsuno H, Miyano S. Genomic object net: I. a platform for modeling and simulating biopathways. Applied Bioinformatics, 2003, 2(3): 181-184.


Copyright 2008 by Journal of Computer Science and Technology