Wednesday 14 September 2011

Adding custom reference genome to Torrent Server manually - My Experience


Apologies! After digging in the Ion Community a little more, I think this is the updated link for V1.4 TS 

Adding a New Genome Index 

Created on: Jul 7, 2011 4:29 PM by ghartsell - Last Modified:  Jul 11, 2011 1:48 PM by ghartsell

However, the manually created reference index doesn't appear in the final drop-down menu when I try to do realignment (it does appear in the References tab).


I don't really understand this line:
"As of release 1.1.0, only the "tmap-f1" index_type is supported."

The index I created had an info.txt with tmap-f2.

In any case, if you don't mind fiddling with the web browser and you are met with 'file deleted', or the job has started but you still do not have your index, you can restart ionJobServer:

$ sudo /etc/init.d/ionJobServer restart

Adapted from the original doc here 

Adding a New Genome Index


As part of the standard analysis process, reads are aligned to a genomic reference, and the alignments and some summary statistics based on them are included in the analysis report page.  This HOWTO describes the process for adding a new reference genome, which becomes necessary when a user starts to work with a new genome sequence.

The aligner used is named tmap and it comes pre-installed on the Torrent Server.

 Prerequisites 


Before we begin, you will need your reference sequence in a single file in FASTA format, and you will need command-line access to the Torrent Server.  Please note that the file must have Unix line endings, not Windows line endings. (It can be in .zip compressed format, but I didn't test this.)
You will need admin rights to scp the file over to /results/referenceLibrary/tmap-f2/
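
The line-ending requirement is easy to check before copying anything over. Below is a minimal sketch in POSIX shell; the function names has_crlf and to_unix_endings are my own (hypothetical) helpers, not part of Torrent Suite.

```shell
# Hypothetical helper: succeed (exit 0) if the file contains a carriage return,
# which is the telltale sign of Windows (CRLF) line endings.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Hypothetical helper: write a copy of the file with every carriage return deleted.
# $1 = input FASTA, $2 = output FASTA
to_unix_endings() {
    tr -d '\r' < "$1" > "$2"
}
```

For example, `has_crlf my_genome.fasta && to_unix_endings my_genome.fasta my_genome.unix.fasta` would produce a cleaned copy only when one is needed (my_genome.fasta is a placeholder file name).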

 Procedure 

 Select a Short Form of Genome Name 

The short form of the genome name is the name under which you would like the reference to appear when initiating a run on the PGM™ instrument. There are some rules for defining the short form of the genome name:
  1. it should not match any of the existing references installed under the standard reference location
  2. it should consist solely of alphanumeric characters, underscores ("_") and periods (".")
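
The two rules above can be checked mechanically. This is only a sketch: is_valid_short_name is a hypothetical helper of mine, and it assumes that each installed reference lives in a subdirectory named after its short name under the reference location (which is what the paths elsewhere in this post suggest).

```shell
# Hypothetical helper: succeed only if the short name obeys both rules.
# $1 = candidate short name
# $2 = reference location (defaults to the tmap-f2 directory used in this post)
is_valid_short_name() {
    name=$1
    ref_root=${2:-/results/referenceLibrary/tmap-f2}
    # Rule 2: reject empty names and any character outside A-Z, a-z, 0-9, "_" and "."
    case $name in
        ''|*[!A-Za-z0-9_.]*) return 1 ;;
    esac
    # Rule 1 (assumption): an existing entry with the same name means a clash
    [ ! -e "$ref_root/$name" ]
}
```

Something like `is_valid_short_name A_flavithermus || echo "pick another name"` can be run before building the index.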
 Index Creation 

The alignment package (ion-alignment) comes with a wrapper script, build_genome_index.pl, that automates the TMAP index creation process. It requires four inputs:
  • single FASTA file
  • short form of the genome name (see previous section)
  • long form of the genome name (see next section for description)
  • genome version (see next section for description)

The steps to create the index:
  1. move or copy the FASTA file to the standard reference location
  2. change to that location and run build_genome_index.pl
$ cd /results/referenceLibrary/tmap-f2/
$ build_genome_index.pl --fasta A_flavithermus.fasta -s A_flavithermus \
    -v "gi|212637849|ref|NC_011567.1" \
    -l "Anoxybacillus flavithermus WK1 chromosome complete genome"

Copying A_flavithermus.fasta to A_flavithermus/A_flavithermus.fasta...
  ...copy complete
Making tmap index...
  ...tmap index complete
Making samtools index...
  ...samtools index complete

There should now be 10 files in the directory, including the original fasta file.  The size of the files varies by genome - for the human genome (3,000,000,000 bases in length) the combined size of all index files, including the original fasta file itself, is just under 8Gb.  For E. coli (4,600,000 bases in length) it is about 0.4Gb.
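You might want to delete the original copy of the FASTA file from the reference location to keep things tidy (the script has already copied it into the new index directory):
$ rm A_flavithermus.fasta
$ ls -1 /results/referenceLibrary/tmap-f2/A_flavithermus/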
A_flavithermus.fasta
A_flavithermus.fasta.fai
A_flavithermus.fasta.md5
A_flavithermus.fasta.tmap.anno
A_flavithermus.fasta.tmap.bwt
A_flavithermus.fasta.tmap.pac
A_flavithermus.fasta.tmap.rbwt
A_flavithermus.fasta.tmap.rpac
A_flavithermus.fasta.tmap.rsa
A_flavithermus.fasta.tmap.sa
A_flavithermus.info.txt
samtools.log
tmap.log
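
To confirm that a build finished cleanly, the directory can be compared against the listing above. The sketch below is a hypothetical helper of mine (missing_index_files); the list of suffixes is taken from my listing for a tmap-f2 index and may differ for other index types or software versions.

```shell
# Hypothetical helper: print the name of every expected index file that is missing.
# $1 = index directory, $2 = short name used as the file prefix
missing_index_files() {
    dir=$1
    short=$2
    # Suffixes copied from the directory listing in this post (assumption: tmap-f2)
    for suffix in fasta fasta.fai fasta.md5 fasta.tmap.anno fasta.tmap.bwt \
                  fasta.tmap.pac fasta.tmap.rbwt fasta.tmap.rpac \
                  fasta.tmap.rsa fasta.tmap.sa info.txt; do
        [ -e "$dir/$short.$suffix" ] || echo "$short.$suffix"
    done
}
```

Running `missing_index_files /results/referenceLibrary/tmap-f2/A_flavithermus A_flavithermus` prints nothing when every file from the listing is present.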

 Adding the Genome to the PGM Drop-down Menu 

For additional convenience it is also recommended (though not required) to add the genome to the list that is made available on the PGM as a drop-down menu - this can be very helpful in avoiding typos on the PGM.

updateref will crawl through the directory, grab the genome_shortname fields from all installed reference libraries of the version specified, and overwrite reference_list.txt. When updateref is called without any command-line arguments, it assumes the default settings; for example, /results/PGM_config is the default location of the PGM configuration. The location is crucial because it needs to be under the same root directory to which the PGM transfers its data. For example, if PGMs transfer data to a file server that is mounted as /mnt/PGM_Data on the Torrent Server, the option -p /mnt/PGM_Data/PGM_config needs to be specified. updateref --help will list more options.

Default settings. PGM data are stored in /results.
$ sudo updateref
List of library
-> ampl_valid
-> vibrio_fisch
-> e_coli_k12
-> e_coli_dh10b
-> rhodopalu

Customized environment. PGM data are stored in /mnt/PGM_Data.
$ sudo updateref -p /mnt/PGM_Data/PGM_config
You may also edit the text file manually (I did this as I couldn't find updateref):
$ sudo vim /results/PGM_config/reference_list.txt
Insert the short name into the text file.
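
The manual edit can also be scripted so that a name is never added twice. This is a sketch only: add_reference_name is a hypothetical helper of mine, and it assumes reference_list.txt holds one short name per line, which I have not confirmed against the official documentation.

```shell
# Hypothetical helper: append a short name to the list unless it is already there.
# $1 = short name, $2 = path to reference_list.txt
add_reference_name() {
    # -x matches whole lines only, -F treats the name as a fixed string
    # (so the "." in a short name is not read as a wildcard)
    if ! grep -qxF -- "$1" "$2" 2>/dev/null; then
        printf '%s\n' "$1" >> "$2"
    fi
}
```

Since the file under /results/PGM_config needed sudo to edit in my case, the function would have to run in a root shell; sudo cannot call a shell function directly.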


Update: manually editing the text file doesn't make the genome appear in the libraries for the realignment plugin. Curiously, after adding the reference genome via the web browser, the genome name doesn't appear here.

3 comments:

  1. Some of this information is out of date and only applies to previous versions of Torrent Suite Software. It is best for users to go to the Ion Community directly at http://ioncommunity.iontorrent.com for the most up-to-date information.

  2. Hi, I got this information recently from the URL at the top, which is from the Ion Community page, and indeed it is outdated. Do you have the URL for the updated page within the community? Please share!

  3. New references can be uploaded directly through Torrent Browser under the "References" Tab.

