Journal of Computer Science and Technology, 2010, Vol. 25, Issue (1): 95-106.

• Special Issue on Computational Challenges from Modern Molecular Biology •

Can We Determine a Protein Structure Quickly?

Ming Li (李明), Fellow, ACM, IEEE, Royal Society of Canada   

  1. D.R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, N2L 3G1 Canada
    Dingsheng Technologies, Beijing 100085, China
  • Received: 2009-10-13; Revised: 2009-11-16; Online: 2010-01-05; Published: 2010-01-05
  • About author:
    Ming Li is a Canada Research Chair in Bioinformatics and a University Professor at the University of Waterloo. He is a fellow of the Royal Society of Canada, ACM, and IEEE. He received the E.W.R. Steacie Fellowship Award in 1996 and the Killam Fellowship in 2001. Together with Paul Vitanyi, he pioneered the applications of Kolmogorov complexity and co-authored the book "An Introduction to Kolmogorov Complexity and Its Applications". His recent research interests include protein structure determination and Internet search engines.
  • Supported by:

    This work was partially supported by the National High Tech Research and Development 863 Program under Grant No. 2008AA02Z313 from China's Ministry of Science and Technology, Canada's NSERC under Grant No. OGP0046506, Canada Research Chair Program, an NSERC Collaborative Grant, and Ontario's Premier's Discovery Award.

Can we determine a high-resolution protein structure quickly, say, in a week? I will show that this is possible with current technologies together with the new computational tools discussed in this article. We have three potential paths to explore:





  • X-ray crystallography. While this method has produced the most protein structures in the PDB (Protein Data Bank), the nasty trial-and-error crystallization step remains a prohibitive obstacle.
  • NMR (Nuclear Magnetic Resonance) spectroscopy. While the NMR experiments are relatively easy to do, interpreting the NMR data for structure calculation takes several months on average.
  • In silico protein structure prediction. Can we actually predict high-resolution structures consistently? If the predicted models remain labeled as "predicted", and these structures still need to be verified experimentally by wet-lab methods, then this method can at best serve only as a screening tool.

I investigate the question of "quick protein structure determination" from a computer scientist's point of view and actually answer the more relevant question: "What can a computer scientist effectively contribute to this goal?"








    ISSN 1000-9000(Print)

    CN 11-2296/TP

    Journal of Computer Science and Technology
    Institute of Computing Technology, Chinese Academy of Sciences
    P.O. Box 2704, Beijing 100190 P.R. China
    E-mail: jcst@ict.ac.cn