Thursday, 12 January 2012

Cortex assembler paper


"De novo assembly and genotyping of variants using colored de Bruijn graphs", Iqbal, Caccamo, Turner, Flicek, McVean
Nature Genetics, (doi:10.1038/ng.1028)

This link will work for a bit
http://www.nature.com/ng/journal/vaop/ncurrent/full/ng.1028.html

You may be interested in some of the things we cover:

 - low and predictable memory use
 - simultaneous assembly of multiple samples, with variants called directly, without assembling a consensus first
  (e.g. you could assemble over 2000 S. aureus in 32 GB of RAM, or 10 humans in 256 GB of RAM)
 - a mathematical model extending the Lander-Waterman statistics to include information on repeat content,
  allowing you to choose a k-mer size depending on what you want to achieve
 - validation using fully sequenced fosmids
 - comparison of Cortex variant calls with 1000 Genomes pilot calls
 - showing you can make good variant calls without using a reference if you sequence multiple samples from a population (we did this with chimps)
 - a proof-of-concept of HLA-typing at HLA-B using whole genome (not pull-down) data
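
For readers new to the idea, here is a toy sketch of what "colored" means in a colored de Bruijn graph: each k-mer remembers which samples it was seen in. This is not how Cortex is implemented (Cortex is far more memory-efficient and calls variants from graph structure). The function names, the sample names and the sequences below are all made up for illustration, and edges are left implicit as (k-1)-base overlaps between k-mers.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every k-mer (substring of length k) in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_colored_graph(samples, k):
    """Map each k-mer to the set of sample names ("colors") containing it.

    samples: dict of sample name -> list of read strings (hypothetical input format).
    """
    graph = defaultdict(set)
    for color, reads in samples.items():
        for read in reads:
            for kmer in kmers(read, k):
                graph[kmer].add(color)
    return graph


def private_kmers(graph, color):
    """Return the sorted k-mers seen only in the given color.

    k-mers private to one sample are a crude hint of where samples differ;
    this is an illustration, not a variant caller.
    """
    return sorted(kmer for kmer, colors in graph.items() if colors == {color})


# Two made-up samples that differ by a single base (A vs T at position 5).
samples = {
    "sample_a": ["ACGTACGGA"],
    "sample_b": ["ACGTTCGGA"],
}
graph = build_colored_graph(samples, k=4)

print(private_kmers(graph, "sample_a"))
print(private_kmers(graph, "sample_b"))
```

The k-mers shared by both samples carry both colors, while the k-mers spanning the differing base carry only one. In a real graph these two sets form the two sides of a "bubble", which is the kind of structure a variant caller looks for.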
