Predicting RNA Secondary Structure Using Profile Stochastic Context-Free Grammars and Phylogenic Analysis
-
Abstract
Stochastic context-free grammars (SCFGs) have been applied to predicting RNA secondary structure. The prediction of RNA secondary structure can be facilitated by incorporating comparative sequence analysis. However, most existing SCFG-based methods lack explicit phylogenic analysis of homologous RNA sequences, which is probably why these methods are not ideal in practical application. Hence, we present a new SCFG-based method that integrates phylogenic analysis with a newly defined profile SCFG. The method can be summarized as follows: 1) we define a new profile SCFG, M, to depict the consensus secondary structure of a multiple RNA sequence alignment; 2) we introduce two distinct hidden Markov models, λ and λ′, to perform phylogenic analysis of homologous RNA sequences, where λ is for non-structural regions of the sequence and λ′ is for structural regions of the sequence; 3) we merge λ and λ′ into M to devise a combined model for prediction of RNA secondary structure. We tested our method on data sets constructed from the Rfam database. The sensitivity and specificity of our method are higher than those of the predictions by Pfold.
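As an illustration of the two evaluation measures named above, the following minimal sketch computes them over base pairs. It rests on assumptions not stated in this abstract: structures are written in dot-bracket notation without pseudoknots, sensitivity is taken as the fraction of reference base pairs that are predicted, and specificity is taken as the fraction of predicted base pairs that appear in the reference (a convention common in the RNA structure literature). The function names are hypothetical and are not part of the method or of Pfold.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string.

    Assumes a well-nested structure that uses only '(', ')' and '.'.
    """
    stack, pairs = [], set()
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)          # remember the opening position
        elif symbol == ")":
            i = stack.pop()          # match with the most recent opening
            pairs.add((i, j))
    return pairs


def sensitivity_specificity(reference, predicted):
    """Compare predicted base pairs against a reference structure.

    Assumed definitions (not taken from the abstract):
      sensitivity = correct pairs / pairs in the reference
      specificity = correct pairs / pairs in the prediction
    """
    ref, pred = base_pairs(reference), base_pairs(predicted)
    correct = len(ref & pred)
    sensitivity = correct / len(ref) if ref else 0.0
    specificity = correct / len(pred) if pred else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy structures, invented for illustration only.
    print(sensitivity_specificity("((..))..", "(....).."))
```

Under these assumed definitions, a prediction that reports only a few confident pairs tends to score high on specificity and low on sensitivity, which is why the two values are reported together.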
-
-