Thursday, 31 December 2009

Use csplit to split FASTA files

Got this off the net a long while back. Sorry I can't attribute the source; please drop a comment if you know the original author. Naming the files with an increasing counter is a godsend if you want to batch qsub / PBS jobs.

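# split fasta file into separate sequence files

# need exactly two arguments: the fasta file and the output folder
if [ $# -ne 2 ]
then
 echo "Use: fsplit SEQFILE DESTDIR"
 echo "     Splits fasta file SEQFILE into separate files in DESTDIR folder"
 exit 1
fi

seqfile=$1
destdir=$2
mkdir -p "$destdir"
#"%^>%" skips anything before the first header, "/^>/" "{*}" then splits at every later header
#names the fa files as sequence00 i.e. with padding
#csplit -s -f "$destdir/sequence" "$seqfile" "%^>%" "/^>/" "{*}"
#names the fa files as sequence0 i.e. without padding
csplit -s -n 1 -f "$destdir/sequence" "$seqfile" "%^>%" "/^>/" "{*}"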
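
Here is a rough sketch of how the numbered files can feed a batch submission loop. It builds a tiny three-record FASTA file, splits it with the same csplit call, then walks the counter. The names demo.fa, demo_out and run_job.sh are made up for illustration, the qsub line is only echoed rather than run, and the "{*}" repeat assumes GNU csplit.

```shell
#!/bin/sh
# Sketch only: demo.fa, demo_out and run_job.sh are hypothetical names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A tiny three-record FASTA file to split.
printf '>seq_a\nACGT\n>seq_b\nGGCC\nTTAA\n>seq_c\nAAAA\n' > demo.fa

mkdir -p demo_out
# Same csplit call as in the script above (GNU csplit needed for "{*}").
csplit -s -n 1 -f demo_out/sequence demo.fa "%^>%" "/^>/" "{*}"

# Walk the numbered files until the counter runs out.
# echo only prints the qsub line; remove it to actually submit.
i=0
while [ -f "demo_out/sequence$i" ]
do
 echo "qsub -v SEQ=demo_out/sequence$i run_job.sh"
 i=$((i + 1))
done
```

For the three-record file this should print one qsub line each for sequence0, sequence1 and sequence2. How the job script picks up SEQ depends on your scheduler setup, so treat that part as a placeholder.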
