Thursday 31 January 2013

Fast and accurate read mapping with approximate seeds and multiple backtracking.


 2013 Jan 28. [Epub ahead of print]

Source

Department of Mathematics and Computer Science, Freie Universität Berlin, Takustr. 9, 14195 Berlin, Germany and Max Planck Institute for Molecular Genetics, Ihnestr. 63-73, 14195 Berlin, Germany.

Abstract

We present Masai, a read mapper representing the state of the art in terms of speed and accuracy. Our tool is an order of magnitude faster than RazerS 3 and mrFAST, and 2-4 times faster and more accurate than Bowtie 2 and BWA. The novelties of our read mapper are filtration with approximate seeds and a method for multiple backtracking. Approximate seeds, compared with exact seeds, increase filtration specificity while preserving sensitivity. Multiple backtracking amortizes the cost of searching a large set of seeds by taking advantage of the repetitiveness of next-generation sequencing data. Combined, these two methods significantly speed up approximate search on genomic data sets. Masai is implemented in C++ using the SeqAn library. The source code is distributed under the BSD license, and binaries for Linux, Mac OS X and Windows can be freely downloaded from http://www.seqan.de/projects/masai.
PMID: 23358824 [PubMed - as supplied by publisher]
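
To make the seed filtration idea from the abstract concrete, here is a minimal Python sketch. It is not Masai's implementation: it assumes Hamming distance (mismatches only), replaces the genome index with a naive scan, and leaves out multiple backtracking entirely. The function names (`hamming`, `split_into_seeds`, `map_read`) are made up for illustration. The underlying argument is the pigeonhole principle: if a read matches with at most k mismatches and is cut into s non-overlapping seeds, at least one seed matches with at most floor(k / s) mismatches. Choosing s = k + 1 gives exact seeds; choosing fewer, longer seeds gives approximate seeds.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def split_into_seeds(read, num_seeds):
    """Cut the read into num_seeds non-overlapping pieces.

    Returns a list of (offset_in_read, seed) pairs.
    """
    base, extra = divmod(len(read), num_seeds)
    seeds, start = [], 0
    for i in range(num_seeds):
        length = base + (1 if i < extra else 0)
        seeds.append((start, read[start:start + length]))
        start += length
    return seeds


def map_read(reference, read, max_errors, num_seeds):
    """Seed-and-verify mapping under Hamming distance (illustrative only).

    Returns (hits, num_candidates): the sorted start positions where the
    read matches with at most max_errors mismatches, and the number of
    candidate positions that had to be verified.
    """
    # Pigeonhole bound: some seed has at most this many mismatches.
    seed_errors = max_errors // num_seeds
    candidates = set()
    for offset, seed in split_into_seeds(read, num_seeds):
        # Naive scan standing in for an index lookup (assumption).
        for pos in range(len(reference) - len(seed) + 1):
            if hamming(reference[pos:pos + len(seed)], seed) <= seed_errors:
                start = pos - offset
                if 0 <= start <= len(reference) - len(read):
                    candidates.add(start)
    # Verification step: check the full read at each candidate position.
    hits = sorted(
        s for s in candidates
        if hamming(reference[s:s + len(read)], read) <= max_errors
    )
    return hits, len(candidates)
```

Calling `map_read(reference, read, 2, 3)` uses three exact seeds, while `map_read(reference, read, 2, 2)` uses two longer seeds that each tolerate one mismatch. Both return the same hits, and the second return value shows how many candidates each scheme had to verify on your data.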
