Apologies! After digging in the Ion Community a little more, I think this is the updated link for V1.4 TS
Created on: Jul 7, 2011 4:29 PM by ghartsell - Last Modified: Jul 11, 2011 1:48 PM by ghartsell
But the manually created reference index doesn't appear in the final dropdown menu when I try to do realignment (it does appear in the reference tab).
I don't really understand this line: "As of release 1.1.0, only the "tmap-f1" index_type is supported.", as the index I created had an info.txt with tmap-f2.
In any case, if you don't mind fiddling with the web browser, and you are met with 'file deleted', or the job has started but you still do not have your index, you can
restart ionJobServer:
sudo /etc/init.d/ionJobServer restart
Adding a New Genome Index
As part of the standard analysis process, reads are aligned to a genomic
reference, and the alignments and some summary statistics based on the
alignments are included in the analysis report page. This HOWTO
describes the process to add a new reference genome, something that
will be necessary when a user starts to work with a new genome
sequence.
The aligner used is named tmap and it comes pre-installed on the Torrent
Server.
Prerequisites
Before we begin, you will need your reference sequence in a single file
in fasta format and you will need command-line access to the Torrent
Server. Please note that the file must have Unix line endings and not
Windows line endings. (It can be in .zip compressed format, but I didn't
test this.)
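Since the reference must have Unix line endings, it may be worth checking the file before building the index. The following is only a minimal sketch, assuming a POSIX shell with grep and tr available; the function names has_crlf and to_unix are made up for illustration and are not part of the Torrent Server software.

```shell
# Hypothetical helpers, not part of the Torrent Server software.

# Succeed (exit status 0) if the file contains at least one carriage return.
has_crlf() {
    grep -q "$(printf '\r')" "$1"
}

# Write a copy of the input file with every carriage return removed.
# Usage: to_unix <input-file> <output-file>
to_unix() {
    tr -d '\r' < "$1" > "$2"
}
```

For example, `has_crlf genome.fasta && to_unix genome.fasta genome.unix.fasta` would write a converted copy only when Windows line endings are found (genome.fasta is a placeholder name).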
You will need admin rights to scp the files over to
/results/referenceLibrary/tmap-f2/
Procedure
Select a Short Form of Genome Name
The short form of genome name is the name under which you would like the
reference to appear when initiating a run on the PGM™ instrument. There
are some rules on how to define the short form of the genome name.
- it should also be composed solely of alphanumeric characters,
  underscore ("_") and period (".")
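The character rule above can be checked in the shell before running the index script. This is just a sketch: is_valid_shortname is a hypothetical helper name, and it tests only the one rule listed here (alphanumerics, underscore and period), not any other rules the original HOWTO may have had.

```shell
# Hypothetical helper: succeed only if the name is non-empty and contains
# nothing but letters, digits, underscore and period.
# Note: the A-Z / a-z ranges follow the current locale; run with LC_ALL=C
# if strict ASCII matching is wanted.
is_valid_shortname() {
    case "$1" in
        ''|*[!A-Za-z0-9_.]*) return 1 ;;  # empty, or has a disallowed character
        *) return 0 ;;
    esac
}
```

For example, `is_valid_shortname A_flavithermus && echo ok` prints "ok", while a name containing a space or a "|" is rejected.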
Index Creation
The alignment package (ion-alignment) comes with a wrapper script,
build_genome_index.pl, that automates the TMAP index creation process.
It requires four inputs:
- fasta file of the reference sequence (see Prerequisites)
- short form of the genome name (see previous section)
- long form of the genome name (see next section for description)
- genome version (see next section for description)
The steps to create the index:
$ cd /results/referenceLibrary/tmap-f2/
$ build_genome_index.pl --fasta A_flavithermus.fasta -s A_flavithermus \
    -v "gi|212637849|ref|NC_011567.1" \
    -l "Anoxybacillus flavithermus WK1 chromosome complete genome"
Copying A_flavithermus.fasta to A_flavithermus/A_flavithermus.fasta...
...copy complete
Making tmap index...
...tmap index complete
Making samtools index...
...samtools index complete
There should now be 10 files in the directory, including the original fasta
file. The size of the files varies by genome: for the human
genome (3,000,000,000 bases in length) the combined size of all index
files, including the original fasta file itself, is just under
8 GB. For E. coli (4,600,000 bases in length) it is about 0.4 GB.
You might want to delete the original fasta file (the script has already
copied it into the A_flavithermus/ directory) to keep things tidy:
$ rm A_flavithermus.fasta
$ ls -1 /results/referenceLibrary/tmap-f2/A_flavithermus/
A_flavithermus.fasta
A_flavithermus.fasta.fai
A_flavithermus.fasta.md5
A_flavithermus.fasta.tmap.anno
A_flavithermus.fasta.tmap.bwt
A_flavithermus.fasta.tmap.pac
A_flavithermus.fasta.tmap.rbwt
A_flavithermus.fasta.tmap.rpac
A_flavithermus.fasta.tmap.rsa
A_flavithermus.fasta.tmap.sa
A_flavithermus.info.txt
samtools.log
tmap.log
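To confirm that the index build left the expected files behind, something like the following could be used. It is only a sketch: missing_index_files is a hypothetical helper, and the suffix list is simply copied from the listing above, so it assumes that listing is representative of your TMAP version.

```shell
# Hypothetical helper: print the name of every expected index file that is
# missing from a reference directory (prints nothing if all are present).
# Usage: missing_index_files <directory> <shortname>
missing_index_files() {
    dir=$1
    name=$2
    # Suffixes taken from the directory listing above (an assumption).
    for suffix in fasta fasta.fai fasta.md5 fasta.tmap.anno fasta.tmap.bwt \
                  fasta.tmap.pac fasta.tmap.rbwt fasta.tmap.rpac \
                  fasta.tmap.rsa fasta.tmap.sa info.txt; do
        [ -e "$dir/$name.$suffix" ] || echo "$name.$suffix"
    done
}
```

For example: `missing_index_files /results/referenceLibrary/tmap-f2/A_flavithermus A_flavithermus`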
Adding the Genome to the PGM Drop-down Menu
For additional convenience it is also recommended (though not required)
to add the genome to the list that is made available on the PGM as a
drop-down menu. This can be very helpful in avoiding typos on the
PGM.
updateref will crawl through the directory and grab the genome_shortname
fields from all installed reference libraries of the version specified
and overwrite reference_list.txt. When updateref is called without any
command-line argument, it will assume the default settings. For example,
/results/PGM_config is the default location of the PGM configuration.
The location is crucial because it needs to be under the same root
directory to which the PGM transfers the data. For example, if PGMs
transfer data to a file server that is mounted as /mnt/PGM_Data on the
Torrent Server, the option -p /mnt/PGM_Data/PGM_config needs to be
specified. updateref --help will list more options.
Default settings. PGM data are stored in /results.
$ sudo updateref
List of library
-> ampl_valid
-> vibrio_fisch
-> e_coli_k12
-> e_coli_dh10b
-> rhodopalu
Customized environment. PGM data are stored in /mnt/PGM_Data.
$ sudo updateref -p /mnt/PGM_Data/PGM_config
You may also manually edit the text file (I did this as I can't
find updateref):
sudo vim /results/PGM_config/reference_list.txt
and insert the short name into the text file.
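The manual edit can also be done without an editor. The sketch below assumes that reference_list.txt holds one short name per line, which is my reading of the file and not something the HOWTO confirms; add_shortname is a hypothetical helper name.

```shell
# Hypothetical helper: append a short name to a list file unless an
# identical line is already present (assumes one short name per line).
# Usage: add_shortname <list-file> <shortname>
add_shortname() {
    # -x: match whole lines only, -F: treat the name as a fixed string
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

Running it twice with the same name leaves a single entry. On the Torrent Server the file presumably needs root privileges to modify, so the function would have to be run from a root shell.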
Update: manually editing the text file doesn't make the genome appear in the libraries for the realignment plugin. Curiously, after adding the reference genome via the web browser, the genome name doesn't appear here.