Journal of Computer Science and Technology, 2010, 25(1): 71-81. ISSN: 1000-9000. CN: CN 11-2296/TP

Survey

Metagenomics: Facts and Artifacts, and Computational Challenges

John C. Wooley1 and Yuzhen Ye2 (叶玉珍)

1Center for Research on BioSystems, Calit2, University of California San Diego, La Jolla, CA 92093, U.S.A.
2School of Informatics and Computing, Indiana University, Bloomington, Indiana, 47408, U.S.A.

Abstract

Metagenomics is the study of microbial communities sampled directly from their natural environment, without prior culturing. By enabling the analysis of populations that include many so-far unculturable and often unknown microbes, metagenomics is revolutionizing microbiology and has excited researchers in many disciplines that could benefit from the study of environmental microbes, including ecology, environmental sciences, and biomedicine. Specific computational and statistical tools have been developed for the analysis and comparison of metagenomic data. New studies, however, have revealed various artifacts in metagenomic data, caused by limitations in the experimental protocols and/or inadequate data analysis procedures, that often lead to incorrect conclusions about a microbial community. Here, we review some of these artifacts, such as overestimation of species diversity and incorrect estimation of gene family frequencies, and discuss emerging computational approaches to address them. We also review the challenges that metagenomics may encounter as next-generation sequencing (NGS) techniques are applied more extensively.

Keywords: metagenomics; next-generation sequencing (NGS); taxonomic/functional profiling; statistical approaches; comparative metagenomics
Received: 2009-08-30 Accepted: 2009-11-16

Fund: This work is supported by NIH under Grant No. 1R01HG004908-01, by NSF of USA under Grant No. DBI-0845685 (YY), and by the Gordon and Betty Moore Foundation for the Community Cyberinfrastructure for Marine Microbial Ecological Research and Analysis (CAMERA) Project (JW).

Email: jwooley@ucsd.edu; yye@indiana.edu
About author(s):
John C. Wooley is associate vice chancellor of research and professor of Chemistry-Biochemistry and of Pharmacology at UC San Diego. He now largely focuses on structural genomics (SG) and metagenomics (MG), along with various bioinformatics and computational methods for probing SG and MG data and for community engagement via web services technology.
Yuzhen Ye is an assistant professor at the School of Informatics and Computing, Indiana University, Bloomington. Ye received her Ph.D. degree in computational biology from Shanghai Institute of Biochemistry, Chinese Academy of Sciences in 2001. Her research interests are in bioinformatics (especially structural bioinformatics) and computational metagenomics. Ye received an NIH grant for developing computational tools for the human microbiome project in 2008, and an NSF CAREER Award in 2009.


Copyright 2008 by Journal of Computer Science and Technology