Wednesday, 6 June 2012

Fwd: [Velvet-users] - choose a good k-value for your genome automatically

From: Torsten Seemann

Hi all,

I have written a simple script to choose (or list) good k-values for
YOUR data with YOUR genome.

It's called and it needs two things:
(1) the target genome size (you can supply a number, e.g. 4.8M) or a FASTA
file of a close reference
(2) your read files (FASTA/FASTQ, uncompressed or bzip2/gzip-compressed, should work)

Example uses might be:

# For manual examination
% --size 3.8M reads.fastq  morereads.fa.gz morereads.fq.bz2 paired.fa
K       #Kmers  Kmer-Cov
91      34649310        34.6
93      27719448        27.7
95      20789586        20.8
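
The columns in the listing above are consistent with a simple calculation: a read of length L contributes L - k + 1 k-mers, and the k-mer coverage is the total k-mer count divided by the genome size. For instance, the #Kmers column is what 3,464,931 reads of 100 bases would give. The Python sketch below illustrates that arithmetic only; it is not the script itself. The function names, the K/M/G size suffixes, the target coverage of 25, and the range of odd k-values are all assumptions made for illustration.

```python
def parse_size(text):
    """Convert a size such as '4.8M' or '3800000' to a number of bases.

    The K/M/G suffixes are an assumption made for this sketch.
    """
    multipliers = {"K": 10**3, "M": 10**6, "G": 10**9}
    text = text.strip().upper()
    if text and text[-1] in multipliers:
        return int(round(float(text[:-1]) * multipliers[text[-1]]))
    return int(text)


def kmer_count(length_counts, k):
    """Total k-mers in the reads; length_counts maps read length -> number of reads."""
    return sum((length - k + 1) * n
               for length, n in length_counts.items()
               if length >= k)  # reads shorter than k contribute nothing


def kmer_coverage(length_counts, k, genome_size):
    """K-mer coverage: total k-mers divided by the genome size in bases."""
    return kmer_count(length_counts, k) / genome_size


def best_k(length_counts, genome_size, target_cov=25.0, k_values=range(21, 128, 2)):
    """Pick the k whose k-mer coverage is closest to target_cov.

    target_cov and k_values (odd values only) are hypothetical defaults.
    """
    return min(k_values,
               key=lambda k: abs(kmer_coverage(length_counts, k, genome_size) - target_cov))


if __name__ == "__main__":
    reads = {100: 3464931}  # 3,464,931 reads of 100 bases each
    for k in (91, 93, 95):
        print(k, kmer_count(reads, k))
```

Keeping the reads as a length histogram rather than a list of sequences means the k-mer count for every candidate k can be computed after a single pass over the read files.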

# For automated scripts
% --genome Ecoli.fna --best reads.fastq  morereads.fa.gz

You can download it from here:

If it is deemed to work well, then we will aim to:
1. incorporate it as "velvetk" in the Velvet distribution
2. rewrite in "C" if needed
3. add a new "auto" option instead of a fixed k-value in velveth.

--Dr Torsten Seemann
--Scientific Director : Victorian Bioinformatics Consortium, Monash
University, AUSTRALIA
--Senior Researcher : VLSCI Life Sciences Computation Centre,
Parkville, AUSTRALIA
