The clue comes from new documentation that is being uploaded to solidsoftwaretools.com.
BioScope™ Software v1.3 adds/enhances support for the following:
- Targeted Resequencing analysis (enrichment statistics and target filtering)
- BFAST integration
- Annotation, reporting and statistics generation
- Methylation analysis
- 75 bp read length support
- Mapping and Pairing speed improvements
It also fixes a long list of bugs. I won't repeat all of them here, but the important ones are:
- Bug – Pairing: In BAM file, readPaired and firstOfPair/secondOfPair flags set incorrectly for reads with missing mates.
- Bug – diBayes: Defunct java processes continue when bioscope exits
- Bug – Mapping: When the last batch of the processing has the number of reads less than the value of the key mapping.np.per.node, the ma file contains duplicated entries.
Have fun playing with the new version when it's up!
Here are some important notes:
It is advised that a user runs BioScope using the user’s own user
account. Then, if Control-C is used to interrupt bioscope.sh, which
spawns many other processes, the user can use the following OS commands
to find the PIDs of the left-over processes and clean them up.
ps -efl | grep bioscope.sh | grep username
ps -efl | grep java_app.sh | grep username
ps -efl | grep map | grep username
ps -efl | grep java | grep username
ps -efl | grep mapreads | grep username
ps -efl | grep pairing | grep username
kill -9 PID
Oh, but I would use the grep java command carefully: it matches every process that has java in its name, so a kill -9 on those PIDs could take down unrelated Java processes too.
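One way to narrow the match is to filter on the BioScope-specific names only and leave the bare java pattern out. This is a minimal sketch, not something from the BioScope documentation: leftover_pids is a hypothetical helper, it assumes input in "PID USER COMMAND" form, and the pattern list is my own pick from the process names above.

```shell
#!/bin/sh
# Sketch: print the PIDs of likely BioScope leftovers owned by one user.
# Reads "PID USER COMMAND..." lines on stdin, for example:
#   ps -eo pid=,user=,args= | leftover_pids "$USER"
# leftover_pids and its pattern list are assumptions, not part of BioScope.
leftover_pids() {
    user=$1
    awk -v user="$user" '
        $2 == user {
            # Rebuild the command line from field 3 onward.
            cmd = ""
            for (i = 3; i <= NF; i++) cmd = cmd " " $i
            # The [x] brackets keep this awk process from matching its own
            # command line if it shows up in the ps listing.
            if (cmd ~ /[b]ioscope\.sh|[j]ava_app\.sh|[m]apreads|[p]airing/)
                print $1
        }'
}
```

I would look over the printed PIDs before killing anything, and try a plain kill before kill -9. Bare java children are deliberately not matched here, so they still need checking by hand. Some ps implementations shorten long usernames, in which case the user comparison finds nothing.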
My suggestion to the team is to have a db table that keeps the PIDs of launched processes instead of depending on non-unique process names. Ensembl's pipeline uses Perl to track jobs with less overhead, and it is much cleaner to clean up.
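To show what I mean by tracking PIDs, here is a rough sketch in which a flat file stands in for the db table. REGISTRY, track and cleanup are hypothetical names of my own; nothing here is how BioScope or Ensembl actually does it.

```shell
#!/bin/sh
# Sketch of PID tracking: a tab-separated file stands in for a db table.
# REGISTRY, track and cleanup are hypothetical names, not part of BioScope.
REGISTRY=${REGISTRY:-/tmp/bioscope_pids.$$}

# Launch a command in the background and record "PID<TAB>command line".
track() {
    "$@" &
    printf '%s\t%s\n' "$!" "$*" >> "$REGISTRY"
}

# Send TERM to every recorded PID that is still alive, then empty the registry.
cleanup() {
    [ -f "$REGISTRY" ] || return 0
    while IFS="$(printf '\t')" read -r pid cmd; do
        kill -0 "$pid" 2>/dev/null && kill "$pid"
    done < "$REGISTRY"
    : > "$REGISTRY"
}
```

With something like this, a launcher would call track for each process it starts, and an interrupted run is cleaned up by calling cleanup, without grepping for names at all. A real table would probably also store the start time, since a recorded PID can be reused by an unrelated process later.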