Tuesday 9 November 2010

SOLiD™ BioScope™ Software v1.3 releasing soon

v1.3 is due for release soon! How do I know, other than the fact that you can already register for v1.3 video tutorials, e.g. SOLiD™ Targeted ReSeq Data Analysis featuring BioScope 1.3 (1 hour)?
The clue comes from the new documentation that is being uploaded to solidsoftwaretools.com.


BioScope™ Software v1.3 adds/enhances support for the following:
  •     Targeted Resequencing analysis (enrichment statistics and target filtering)
  •     BFAST integration
  •     Annotation, reporting and statistics generation
  •     Methylation analysis
  •     75 bp read length support
  •     Mapping and Pairing speed improvements

It also fixes a long list of bugs. I won't repeat all of them here, but the important ones are:

  • Bug – Pairing: In the BAM file, the readPaired and firstOfPair/secondOfPair flags are set incorrectly for reads with missing mates.
  • Bug – diBayes: Defunct java processes continue running when bioscope exits.
  • Bug – Mapping: When the last batch processed has fewer reads than the value of the key mapping.np.per.node, the ma file contains duplicated entries.

Have fun playing with the new version when it's up!
Here are some important notes:


  It is advised that a user runs BioScope using the user’s own user
  account. Then, if Control-C is used to interrupt bioscope.sh, which
  spawns many other processes, the user can use the following OS commands
  to find the PIDs of the left-over processes and clean them up.
  ps -efl | grep bioscope.sh | grep username
  ps -efl | grep java_app.sh | grep username
  ps -efl | grep map | grep username
  ps -efl | grep java | grep username
  ps -efl | grep mapreads | grep username
  ps -efl | grep pairing | grep username
  kill -9 PID
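
To make the find-then-kill routine above a bit less error-prone, here is a minimal sketch of a helper that prints only the PIDs owned by one user whose command line contains a given string. The function name leftover_pids and the variable LEFTOVER_PAT are my own inventions for illustration, not part of BioScope, and the helper assumes it is fed lines of the form "user pid command...".

```shell
#!/bin/sh
# leftover_pids USER PATTERN   (hypothetical helper, not shipped with BioScope)
# Reads lines of "user pid command..." on stdin and prints the PID of every
# process owned by USER whose command contains PATTERN as a literal string.
# PATTERN is passed through the environment rather than as an awk argument,
# so it does not show up on awk's own command line.
leftover_pids() {
    LEFTOVER_PAT="$2" awk -v user="$1" '
        $1 == user {
            # Rebuild the command line from field 3 onward.
            cmd = ""
            for (i = 3; i <= NF; i++) cmd = cmd " " $i
            if (index(cmd, ENVIRON["LEFTOVER_PAT"]) > 0) print $2
        }'
}
```

A possible way to use it: run ps -e -o user= -o pid= -o args= | leftover_pids username bioscope.sh, read through the PIDs it prints, and only then pass the ones you recognise to kill.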


Oh, but I would use the "grep java" command (highlighted in bold in the notes) carefully, as following it with kill -9 basically kills every process that has java in its name.

My suggestion to the team is to have a db table that keeps the PIDs of launched processes instead of depending on non-unique names. Ensembl's pipeline uses Perl, with less overhead, to track jobs, and it is much cleaner to clean up.
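
To illustrate the idea, here is a rough sketch in which a flat file stands in for the db table. Everything in it (PID_REGISTRY, track_launch, kill_tracked) is a hypothetical name of mine, not something BioScope or Ensembl provides. Each launched job is recorded by PID at start-up, so cleanup never has to guess from process names.

```shell
#!/bin/sh
# Hypothetical registry file; a real pipeline might use a database table.
PID_REGISTRY="${PID_REGISTRY:-${TMPDIR:-/tmp}/pipeline_pids.$$}"

# track_launch LABEL COMMAND [ARGS...]
# Starts COMMAND in the background and appends "PID LABEL" to the registry.
track_launch() {
    label="$1"
    shift
    "$@" &
    echo "$! $label" >> "$PID_REGISTRY"
}

# kill_tracked
# Sends SIGTERM to every recorded PID, then empties the registry.
kill_tracked() {
    [ -f "$PID_REGISTRY" ] || return 0
    while read -r pid label; do
        kill "$pid" 2>/dev/null || true
    done < "$PID_REGISTRY"
    : > "$PID_REGISTRY"
}
```

Under these assumptions, a wrapper script would call track_launch for each step it spawns and call kill_tracked from a trap on interrupt. A long-lived registry would also need to guard against PIDs being reused by unrelated processes, which this sketch does not attempt.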
