›› 2010, Vol. 25 ›› Issue (1): 3-9.

• Special Issue on Computational Challenges from Modern Molecular Biology • Previous Articles     Next Articles

New Generations: Sequencing Machines and Their Computational Challenges

David C. Schwartz1 and Michael S. Waterman2,3   

  1. 1Laboratory for Molecular and Computational Genomics, Department of Chemistry and Laboratory of Genetics, University of Wisconsin-Madison, WI 53706, U.S.A.
    2Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089-2910, U.S.A.
    3Department of Automation, Tsinghua University, Beijing 100084, China
  • Received:2009-09-05 Revised:2009-11-24 Online:2010-01-05 Published:2010-01-05
  • About author:
    David C. Schwartz has been a professor of chemistry, genetics, and biotechnology at the University of Wisconsin-Madison since 1999. Previous faculty appointments were at New York University and The Carnegie Institution of Washington-Baltimore. He works at the interface between nanotechnology and genomic science through his invention of single molecule platforms for genome analysis. As a graduate student, he invented Pulsed Field Gel Electrophoresis and continues to train students in the science of how to conceive and develop new tools for biological investigation as the director of a training program.
    Michael S. Waterman is university professor at the University of Southern California where he has been a faculty member since 1982. Prior to that he held positions at Los Alamos National Laboratory and Idaho State University. He also currently holds a chair professor team at Tsinghua University. Professor Waterman works in the area of computational biology, particularly on the analysis DNA, RNA and protein sequence data. He is the co-developer of the Smith-Waterman algorithm for sequence comparison and of the Lander-Waterman formula for physical mapping. He is a member of the U.S. National Academy of Sciences and of the French Académie des Sciences.
  • Supported by:

    This work is supported by NIH under Grant No. R01 HG000225 (DCS) and NSF of USA under Grant No. DBI-0501818 (DCS).

New generation sequencing systems are changing how molecular biology is practiced. The widely promoted $1000 genome will be a reality with attendant changes for healthcare, including personalized medicine. More broadly the genomes of many new organisms with large samplings from populations will be commonplace. What is less appreciated is the explosive demands on computation, both for CPU cycles and storage as well as the need for new computational methods. In this article we will survey some of these developments and demands.

[1] Harris T D, Buzby P R, Babcock H, Beer E, Bowers J, Braslavsky I, Causey M, Colonell J, Dimeo J, Efcavitch J W, Giladi E, Gill J, Healy J, Jarosz M, Lapen D, Moulton K, Quake S R, Steinmann K, Thayer E, Tyurina A, Ward R, Weiss H, Xie Z. Single-molecule DNA sequencing of a viral genome. Science, 2008, 320(5872): 106-109.
[2] Fuller C W, Middendorf L R, Benner S A, Church G M, Harris T, Huang X, Jovanovich S B, Nelson J R, Schloss J A, Schwartz D C, Vezenov D V. The challenges of sequencing by synthesis. Nat. Biotechnol., 2009, 27(11): 1013-1023.
[3] Pemov A, Modi H, Chandler D P, Bavykin S. DNA analysis with multiplex microarray-enhanced PCR. Nucleic Acids Res., 2005, 33(2): e11.
[4] Ronaghi M, Uhlen M, Nyren P. A sequencing method based on real-time pyrophosphate. Science, 1998, 281(5375): 363- 365.
[5] Ronaghi M, Karamohamed S, Pettersson B, Uhlen M, Nyren P. Real-time DNA sequencing using detection of pyrophosphate release. Anal. Biochem., 1996, 242(1): 84-89.
[6] Bentley D R. Whole-genome re-sequencing. Curr. Opin. Genet. Dev., 2006, 16(6): 545-552.
[7] Eid J, Fehr A, Gray J et al. Real-time DNA sequencing from single polymerase molecules. Science, 2009, 323(5910): 133- 138.
[8] Kasianowicz J J, Brandin E, Branton D, Deamer D W. Characterization of individual polynucleotide molecules using a membrane channel. Proc. Natl. Acad. Sci. USA, 1996, 93(24): 13770-13773.
[9] Astier Y, Braha O, Bayley H. Toward single molecule DNA sequencing: Direct identification of ribonucleoside and deoxyribonucleoside 5_-monophosphates by using an engineered protein nanopore equipped with a molecular adapter. J. Am. Chem. Soc., 2006, 128(5): 1705-1710.
[10] Branton D, Deamer D W, Marziali A et al. The potential and challenges of nanopore sequencing. Nat. Biotechnol., 2008, 26(10): 1146-1153.
[11] Sigalov G, Comer J, Timp G, Aksimentiev A. Detection of DNA sequences using an alternating electric field in a nanopore capacitor. Nano. Lett., 2008, 8(1): 56-63.
[12] Jett J H, Keller R A, Martin J C, Marrone B L, Moyzis R K, Ratliff R L, Seitzinger N K, Shera E B, Stewart C C. Highspeed DNA sequencing: An approach based upon fluorescence detection of single molecules. J. Biomol. Struct. Dyn., 1989, 7(2): 301-309.
[13] Clarke J, Wu H C, Jayasinghe L, Patel A, Reid S, Bayley H. Continuous base identification for single-molecule nanopore DNA sequencing. Nat. Nanotechnol., 2009, 4(4): 265-270.
[14] Zhang M Q, Smith A D. Challenges in understanding genomewide DNA methylation. J. Comput. Sci. & Technol., 2010, 25(1): 26-34.
[15] Morozova O, Marra M. Applications of next-generation sequencing technologies in functional genomics. Genomics, 2008, 92(5): 255-264.
[16] Chen Y, Souaiaia T, Chen T. PerM: Efficient mapping of short sequencing reads with periodic full sensitive spaced seeds. Bioinformatics, 2009, 25 (19): 2514-2521.
[17] Staden R. A strategy of DNA sequencing employing computer programs. Nucleic Acids Res., June 11, 1979, 6(7): 2601-2610.
[18] Gingeras T R, Milazzo J P, Sciaky D, Roberts R J. Computer programs for the assembly of DNA sequences. Nucleic Acids Res., September 25, 1979, 7(9): 529-545.
[19] Gallant J, Maier D, Storer J. On finding minimal length superstrings. J. Computer System Sci., 1980, 20(1): 50-58.
[20] Waterman M S. Introduction to Computational Biology. Chapman & Hall, 1995.
[21] Kececioglu J D, Myers E W. Combinatiorial algorithms for DNA sequence assembly. Algorithmica, 1995, 13(1/2): 7-51.
[22] Kececioglu J D. Exact and approximation algorithms for DNA sequence reconstruction
[Ph.D. Dissertation]. University of Arizona, Tucson, USA, 1992.
[23] Myers E W. Toward simplifying and accurately formulating fragment assembly. Journal of Computational Biology, 1995, 2(2): 275-290.
[24] Myers E S. The fragment assembly string graph. Bioinformatics, 2005, 21(Suppl. 2): ii79-ii85.
[25] Venter J C, Adams M D, Myers E W et al. The sequence of the human genome. Science, 2001, 291: 1304-1351.
[26] Lander E S, Linton L M, Birren B et al. Initial sequencing and analysis of the human genome. Nature, 2001, 409(6822): 860-921.
[27] Istrail S, Sutton G, Florea L et al. Whole genome shotgun assembly and comparison of human genome assemblies. Proc. Natl. Acad. Sci. USA, 2003, 101(7): 1916-1921.
[28] Idury R, Waterman M S. A new algorithm for DNA sequence. J. Comput. Biol., 1995, 2(2): 291-306.
[29] Chaisson M J, Pevzner P A. Short read fragment assembly of bacterial genomes. Genome Res., 2008, 18(2): 324-330.
[30] Chaisson M J, Tang H, Pevzner P A. Fragment assembly with short reads. Bioinformatics, 2004, 20(13): 2067-2074.
[31] Myers E W. The fragment assembly string graph. Bioinformatics, 2005, 21(Suppl. 2): ii79-ii85, doi:10.1093/ bioinformatics/ bti7114.
[32] Pevzner P A, Tang H. Fragment assembly with doublebarreled data. Bioinformatics, 2001, 17(Suppl. 1): S225- S233.
[33] Pevzner P A, Tang H, Waterman M S. A Eulerian path approach to DNA fragment assembly. Proc. Natl. Acad. Sci. USA, 2001, 98(17): 9748-9753.
[34] Zerbino D R, Birney E. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res., 2008, 18(5): 821-829.
[35] Church D M, Goodstadt L, Hillier L W et al. Lineage-specific biology revealed by a finished genome assembly of the mouse. PLoS Biol., 2009, 7(5): e1000112.
[36] Valouev A, Schwartz D C, Zhou S, Waterman M S. An algorithm for assembly of ordered restriction maps from single DNA molecules. Proc. Natl. Acad. Sci. USA, 2006, 103(43): 15770-15775.
[37] Nagarajan N, Read T D, Pop M. Scaffolding and validation of bacterial genome assemblies using optical restriction maps. Bioinformatics, 2008, 24(10): 1229-1235.
[38] Schwartz D C, Li X, Hernandez L, Ramnarain S P, Huff E J, Wang Y K. Ordered restriction maps of Saccharomyces cerevisiae chromosomes constructed by optical mapping. Science, 1993, 262(5130): 110-114.
[39] Zhou S, Herschleb J, Schwartz D C. A Single Molecule System forWhole Genome Analysis. New Methods for DNA Sequencing, Mitchelson K R (ed)., Amsterdam: Elsevier, 2007.
[40] Dimalanta E T, Lim A, Runnheim R et al. A microfluidic system for large DNA molecule arrays. Anal. Chem., 2004, 76(18): 5293-5301.
[41] Valouev A, Zhang Y, Schwartz D C, Waterman M S. Refinement of optical map assemblies (original paper). Bioinformatics, 2006, 22(10): 1217-1224.
[42] Valouev A, Li L, Liu Y C, Schwartz D C, Yang Y, Zhang Y, Waterman M S. Alignment of optical maps. J. Comput. Biol., 2006, 13(2): 442-462.
[43] Jo K, Dhingra D M, Odijk T et al. A single-molecule barcoding system using nanoslits for DNA analysis. Proc. Natl. Acad. Sci. USA, 2007, 104(8): 2673-2678.
[44] Ramanathan A, Pape L, Schwartz D C. High-density polymerase-mediated incorporation of fluorochrome-labeled nucleotides. Analytical Biochemistry, 2005, 337(1): 1-11.
[45] Ramanathan A, Huff E J, Lamers C C, Potamousis K D. Forrest D K, Schwartz D C. An integrative approach for the optical sequencing of single DNA molecules. Analytical Biochemistry, 2004, 330(2): 227-241.
[46] Zhou S, Pape L, Schwartz D C. Optical Sequencing: Acquisition from Mapped Single Molecule Templates. Next Generation Genome Sequencing: Towards Personalized Medicine, Janitz M (ed.), 2008, Weinheim: Wiley-VCH Verlag & Co., pp.133-149.
[47] Aguilera A, Gomez-Gonzalez B. Genome instability: A mechanistic view of its causes and consequences. Nat. Rev. Genet., 2008, 9(3): 204-217.

No related articles found!
Full text



[1] Cai Shijie; Zhang Fuyan;. A Fast Algorithm for Polygon Operations[J]. , 1991, 6(1): 91 -96 .
[2] Shen Yidong;. Form alizing Incomplete Knowledge in Incomplete Databases[J]. , 1992, 7(4): 295 -304 .
[3] Yu Shengke;. Reasoning in H-Net: A Unified Approach to Intelligent Hypermedia Systems[J]. , 1996, 11(1): 83 -89 .
[4] Tian Zengping; Wang Yujun; Qu Yunyao; Shi Baile;. On the Expressive Power of F-Logic Language[J]. , 1997, 12(6): 510 -519 .
[5] WU Jinzhao; LIU Zhuojun;. Linear Strategy for Boolean Ring Based Theorem Proving[J]. , 2000, 15(3): 271 -279 .
[6] Jun-Hao Zheng, Lei Deng, Peng Zhang, and Don Xie. An Efficient VLSI Architecture for Motion Compensation of AVS HDTV Decoder[J]. , 2006, 21(3): 370 -377 .
[7] Hao Lang, Bin Wang, Gareth Jones, Jin-Tao Li, Fan Ding, and Yi-Xuan Liu. Query Performance Prediction for Information Retrieval Based on Covering Topic Score[J]. , 2008, 23(4 ): 590 -601 .
[8] Yan-Hui Ding(丁艳辉), Member, CCF, Qing-Zhong Li(李庆忠), Senior Member, CCF Yong-Quan Dong(董永权), Member, CCF, and Zhao-Hui Peng(彭朝晖), Member, CCF. 2D Correlative-Chain Conditional Random Fields for Semantic Annotation of Web Objects[J]. , 2010, 25(4): 761 -770 .
[9] Javier Tejada-Cárcamo, Hiram Calvo, Alexander Gelbukh, and Kazuo Hara. Unsupervised WSD by Finding the Predominant Sense Using Context as a Dynamic Thesaurus[J]. , 2010, 25(5): 1030 -1039 .
[10] Jun Hu (胡军), Bing Wang (王兵), Yu Liu (刘禹), Member, CCF, and De-Yi Li (李德毅), Fellow, CCF. Personalized Tag Recommendation Using Social Influence[J]. , 2012, 27(3): 527 -540 .

ISSN 1000-9000(Print)

CN 11-2296/TP

Editorial Board
Author Guidelines
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: jcst@ict.ac.cn
  Copyright ©2015 JCST, All Rights Reserved