New algorithm sequences single-cell genomes



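An international research team led by Pavel Pevzner, a computer scientist at the University of California, San Diego, has developed a new algorithm that sequences the genome of a single cell of an organism faster and more accurately. The algorithm, called SPAdes, can be used to sequence bacteria that cannot be handled with standard cloning techniques. The researchers call these bacteria the "dark matter of life"; examples include some pathogens found in hospitals and bacteria living in the deep sea or in the human gastrointestinal tract. Ultimately, the researchers hope to apply the algorithm to cancer cells in order to monitor the earliest stage of the disease, when normal cells first turn cancerous. Pevzner and colleagues published their findings in the May 2012 issue of the Journal of Computational Biology. They released the SPAdes algorithm on August 8, 2012.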

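In fall 2011, Pevzner's group, working with Roger Lasken, a single-cell sequencing pioneer at the J. Craig Venter Institute, and with researchers at Illumina, developed the first software able to handle single-cell sequencing data. The researchers published those findings in Nature Biotechnology in September 2011. That a new sequencing algorithm was developed just a few months later shows how quickly the field of single-cell sequencing is advancing. Pevzner's group is now using SPAdes to sequence dark-matter-of-life bacteria and human pathogens.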

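The researchers first went through an intensive bioinformatics boot camp; just two months later they began developing the SPAdes genome assembler, and six months after that they had a new, highly accurate assembler. Pevzner said that assembling genome fragments is like putting together a jigsaw puzzle with millions of small pieces, which is why it is often regarded as one of the most complex problems in bioinformatics. The new SPAdes algorithm developed in this study helps solve that fragment assembly problem.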

Related articles:

Efficient de novo assembly of single-cell bacterial genomes from short-read data sets

Hamidreza Chitsaz, Joyclyn L Yee-Greenbaum, Glenn Tesler, Mary-Jane Lombardo, Christopher L Dupont, Jonathan H Badger, Mark Novotny, Douglas B Rusch, Louise J Fraser, Niall A Gormley, Ole Schulz-Trieglaff, Geoffrey P Smith, Dirk J Evers, Pavel A Pevzner & Roger S Lasken

Whole genome amplification by the multiple displacement amplification (MDA) method allows sequencing of DNA from single cells of bacteria that cannot be cultured. Assembling a genome is challenging, however, because MDA generates highly nonuniform coverage of the genome. Here we describe an algorithm tailored for short-read data from single cells that improves assembly through the use of a progressively increasing coverage cutoff. Assembly of reads from single Escherichia coli and Staphylococcus aureus cells captures >91% of genes within contigs, approaching the 95% captured from an assembly based on many E. coli cells. We apply this method to assemble a genome from a single cell of an uncultivated SAR324 clade of Deltaproteobacteria, a cosmopolitan bacterial lineage in the global ocean. Metabolic reconstruction suggests that SAR324 is aerobic, motile and chemotaxic. Our approach enables acquisition of genome assemblies for individual uncultivated bacteria using only short reads, providing cell-specific genetic information absent from metagenomic studies.
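The "progressively increasing coverage cutoff" mentioned in the abstract can be illustrated with a toy sketch. This is not the published implementation: the `Contig` class, the helper functions, and the four-contig graph below are all hypothetical, invented for illustration. The sketch assumes that after each round of removing low-coverage contigs, unambiguous chains are merged and their length-weighted coverage is recomputed, so that a thinly covered genuine region attached to well-covered neighbours can survive a cutoff that would otherwise delete it.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    length: int        # bases
    coverage: float    # average read depth
    parts: tuple       # names of the original pieces merged into this contig


def remove_below(contigs, edges, cutoff):
    """Drop contigs whose coverage is below the cutoff, plus their edges."""
    keep = {n: c for n, c in contigs.items() if c.coverage >= cutoff}
    kept_edges = {(x, y) for x, y in edges if x in keep and y in keep}
    return keep, kept_edges


def merge_linear_chains(contigs, edges):
    """Merge a->b when b is a's only successor and a is b's only predecessor."""
    merged = True
    while merged:
        merged = False
        for a, b in list(edges):
            succ = [y for x, y in edges if x == a]
            pred = [x for x, y in edges if y == b]
            if a != b and len(succ) == 1 and len(pred) == 1:
                ca, cb = contigs[a], contigs.pop(b)
                total = ca.length + cb.length
                # Length-weighted average coverage of the merged contig.
                cov = (ca.length * ca.coverage + cb.length * cb.coverage) / total
                contigs[a] = Contig(total, cov, ca.parts + cb.parts)
                # Edges that left b now leave the merged contig, named a.
                edges = {(a if x == b else x, y)
                         for x, y in edges if (x, y) != (a, b)}
                merged = True
                break
    return contigs, edges


def fixed_cutoff(contigs, edges, cutoff):
    """Baseline: apply one cutoff once, then merge."""
    contigs, edges = remove_below(dict(contigs), set(edges), cutoff)
    return merge_linear_chains(contigs, edges)[0]


def progressive_cutoff(contigs, edges, max_cutoff):
    """Raise the cutoff step by step, merging after every step."""
    contigs, edges = dict(contigs), set(edges)
    for cutoff in range(1, max_cutoff + 1):
        contigs, edges = remove_below(contigs, edges, cutoff)
        contigs, edges = merge_linear_chains(contigs, edges)
    return contigs


# Hypothetical toy graph: true path A -> B -> C, where B is a poorly
# amplified region, plus a short erroneous branch E hanging off A.
TOY_CONTIGS = {
    "A": Contig(1000, 50.0, ("A",)),
    "B": Contig(200, 3.0, ("B",)),
    "C": Contig(1000, 40.0, ("C",)),
    "E": Contig(50, 1.0, ("E",)),
}
TOY_EDGES = {("A", "B"), ("B", "C"), ("A", "E")}

fixed = fixed_cutoff(TOY_CONTIGS, TOY_EDGES, 4)
progressive = progressive_cutoff(TOY_CONTIGS, TOY_EDGES, 4)
```

On this toy input, a single fixed cutoff of 4 deletes both the erroneous branch E and the genuine low-coverage region B, leaving A and C as two separate contigs. The progressive version removes E at cutoff 2, after which A, B and C merge into one 2,200-base contig whose averaged coverage stays above the later cutoffs.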

Full text: http://www.nature.com/nbt/journal/v29/n10/full/nbt.1966.html

SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing

Anton Bankevich, Sergey Nurk, Dmitry Antipov, Alexey A. Gurevich, Mikhail Dvorkin, Alexander S. Kulikov, Valery M. Lesin, Sergey I. Nikolenko, Son Pham, Andrey D. Prjibelski, Alexey V. Pyshkin, Alexander V. Sirotkin, Nikolay Vyahhi, Glenn Tesler, Max A. Alekseyev, and Pavel A. Pevzner

The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V−SC assembler (specialized for single-cell data) and on popular assemblers Velvet and SoapDeNovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.

Full text: http://online.liebertpub.com/doi/abs/10.1089/cmb.2012.0021

Sources: http://www.bioon.com/biology/postgenomics/527925.shtml

http://phys.org/news/2012-08-team-method-sequencing-dark-life.html
