Tuesday 22 March 2011

de novo assembly of Illumina CEO genome in 11.5 h - new ver of Ray

Kevin: You can't ignore an email with that subject header, but 512 compute cores? I shall have a chat with my HPC vendor.
I am also waiting for the public release of Cortex: http://sourceforge.net/projects/cortexassembler/
Strange that courses teaching the software are available but the software ain't...
http://www.ebi.ac.uk/training/onsite/NGS_120510.html


Velvet and Curtain seem promising for de novo assembly as well.

Ray 1.3.0 is now available online.
http://sourceforge.net/projects/denovoassembler/files/Ray-1.3.0.tar.bz2

The most important change is the correction of a major bug that caused
a parallel infinite loop on the human genome.

This, along with concepts incorporated in Ray 1.2.4, allowed Ray to assemble
the genome of Illumina's CEO in 11.5 hours using 512 compute cores (see
below for the link).

What's new?

1.3.0

2011-03-22

   * Vertices with less than 1 of coverage are ignored during the
computation of seeds and during the computation of extensions.
   * Computation of library outer distances relies on the virtual
communicator.
   * Expiry positions are used to toss away reads that are out-of-range.
   * When only one choice is given during the extension and some reads
are in-range, then the sole choice is picked up.
   * Fixed a bug for empty reads.
   * A read is not added in the active set if it is marked on a
repeated vertex and its mate was not encountered yet.
   * Grouped messages in the extension of seeds.
   * Reads marked on repeated vertices are cached during the extension.
   * Paths are cached in the computation of fusions.
   * Fixed an infinite loop in the extension of seeds.
   * When fetching read markers for a vertex, send a list of mates to
meet if the vertex is repeated in order to reduce the communication.
   * Updated the Instruction Manual.
   * Added a version of the logo without text.
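
Kevin: To get my head around two of the changelog items above (expiry positions, and picking the sole choice when some reads are in-range), here is a rough Python sketch. It is not Ray's code (Ray is a parallel MPI program), and every name in it (ActiveRead, expiry_position, prune_expired, pick_extension) is made up for illustration. It assumes each read in the active set carries the last path position at which it can still support an extension.

```python
from dataclasses import dataclass


@dataclass
class ActiveRead:
    name: str
    # Hypothetical field: last path position at which this read is still in range.
    expiry_position: int


def prune_expired(active_reads, position):
    """Toss away reads whose expiry position has already been passed."""
    return [r for r in active_reads if r.expiry_position >= position]


def pick_extension(choices, active_reads, position):
    """Return (choice, remaining_reads); choice is None when undecided.

    If there is exactly one candidate and at least one read is still
    in range, the sole candidate is taken without further scoring.
    """
    in_range = prune_expired(active_reads, position)
    if len(choices) == 1 and in_range:
        return choices[0], in_range
    return None, in_range


reads = [ActiveRead("r1", 120), ActiveRead("r2", 80)]
choice, remaining = pick_extension(["ACGTT"], reads, position=100)
print(choice, [r.name for r in remaining])  # ACGTT ['r1']
```

With two or more candidates the sketch returns None, which is where a real assembler would fall back to its scoring heuristics; that part is deliberately left out here.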


I fixed a bug that caused an infinite loop. Now Ray can assemble large
genomes. See my blog post for more detail about that.
http://dskernel.blogspot.com/2011/03/de-novo-assembly-of-illumina-ceo-genome.html


Version 1.2.4 of Ray also incorporated new concepts that I will present
at RECOMB-Seq 2011.

The talk is available online:
http://boisvert.info/dropbox/recomb-seq-2011-talk.pdf


Sébastien Boisvert
