Tuesday, 30 November 2010

Why can't Bioscope / mapreads write to BAM natively?

I spotted this small fact in the Bioscope 1.3.1 release notes:

There is significant disk space required for converting ma to BAM
  when the option output.filter=none is used, which roughly needs
  2TB peak disk space for converting a 500 million reads ma file.
  Other options do not need such large peak disk space. The disk
  space required per node is smaller if more jobs are dispatched to
  more nodes.

I would love to see the calculation behind that 2 TB figure. I am glad they moved to BAM in the Bioscope workflow, but I am not sure why they keep the .ma file format when they are the only ones using it.
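The only arithmetic the release note itself supports is the implied peak disk space per read. Here is a quick back-of-envelope sketch in Python. It assumes "2TB" means decimal terabytes (10^12 bytes), and the function name is mine, not anything from Bioscope. It says nothing about how the vendor actually arrived at the number.

```python
# Back-of-envelope: peak disk space per read implied by the release note.
# Assumption: "2TB" is decimal (2 * 10**12 bytes); the note does not say.
PEAK_DISK_BYTES = 2 * 10**12
READS = 500 * 10**6  # "500 million reads"


def bytes_per_read(peak_bytes: int, reads: int) -> float:
    """Return the average peak disk usage per read, in bytes."""
    return peak_bytes / reads


if __name__ == "__main__":
    print(bytes_per_read(PEAK_DISK_BYTES, READS))  # 4000.0
```

That works out to roughly 4 KB of peak disk per read, which presumably includes intermediate files on top of the final BAM. The release note does not give a breakdown.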


Datanami, Woe be me