There's lots of stuff you can do with sed awk grep in linux.
Below is a few commands that I modified from samtools helplist that will help others in trying to sieve out mapping info from sam files
Basically it works by looking at col 2 of the sam file for the flag to see if the read is mapped, unmapped. Look at above link for more info on the sam format.
#extract unmapped reads from sam file
gawk '!/^@/&&and($2,4)' aln.sam > aln.sam.unmapped.reads
#extracts readname & name of seq in reference where read is mapped
gawk '!/^@/&&!and($2,4){print $1 $3}' aln.sam > aln.sam.mapped.reads
#counts the number of reads mapped to seq in reference
gawk '{print $3}' aln.sam.mapped.reads|sort |uniq -c |sort -n > aln.sam.mapped.reads.stats
No comments:
Post a Comment