Kevin's GATTACA World: Adding custom reference genome to Torrent Server manually

Wednesday, 14 September 2011

Adding custom reference genome to Torrent Server manually - My Experience

Apologies! After digging in the Ion Community a little more, I think this is the updated link for V1.4 TS

Adding a New Genome Index

Created on: Jul 7, 2011 4:29 PM by ghartsell - Last Modified: Jul 11, 2011 1:48 PM by ghartsell

But the manually created reference index doesn't appear in the final drop-down menu when I try to do realignment (it does appear in the reference tab).

I don't really understand this line: "As of release 1.1.0, only the "tmap-f1" index_type is supported."

The index I created had an info.txt with tmap-f2.

In any case, if you don't mind fiddling with the web browser, and you are met with 'file deleted' or 'job started' but still do not have your index, you can restart ionJobServer:

sudo /etc/init.d/ionJobServer restart

Adapted from the original doc here

Adding a New Genome Index

As part of the standard analysis process reads are aligned to a genomic reference and the alignments and some summary statistics based on the alignments are included in the analysis report page. This HOWTO describes the process to add a new reference genome, something that will be necessary when a user starts to work with a new genome sequence.

~~The aligner used is named tmap and it comes pre-installed on the Torrent Server.~~

Prerequisites

Before we begin, you will need your reference sequence in a single file in FASTA format, and you will need command-line access to the Torrent Server. Please note that the file must have Unix line endings, not Windows line endings. (It can be in .zip compressed format, but I didn't test this.)
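
One way to check the line-ending requirement before copying the file over is sketched below. The helper names has_crlf and to_unix and the demo file are hypothetical, made up for illustration; only the standard grep, tr, printf and mktemp tools are used.

```shell
# Hypothetical helper: succeeds if the file has any carriage return
# (i.e. Windows line endings).
has_crlf() {
    grep -q "$(printf '\r')" "$1"
}

# Hypothetical helper: write a copy of $1 to $2 with carriage returns removed.
to_unix() {
    tr -d '\r' < "$1" > "$2"
}

# Demo on a throwaway FASTA file with Windows line endings (made-up sequence).
demo_dir=$(mktemp -d)
printf '>seq1\r\nACGT\r\n' > "$demo_dir/win.fasta"
if has_crlf "$demo_dir/win.fasta"; then
    to_unix "$demo_dir/win.fasta" "$demo_dir/unix.fasta"
    echo "converted to Unix line endings"
fi
```

Writing the converted copy to a new file, rather than editing in place, keeps the original around in case something goes wrong.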

~~You will need admin rights to scp the files over to /results/referenceLibrary/tmap-f2/~~

Procedure

Select a Short Form of Genome Name

The short form of the genome name is the name under which you would like the reference to appear when initiating a run on the PGM™ instrument. There are some rules for choosing it.

~~it should not match any of the existing references installed under the standard reference location~~
~~it should also be comprised solely of alphanumeric characters, underscore ("_") and period (".")~~
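
The two naming rules can be checked with a small sketch like the one below. The function names valid_short_name and name_is_free are hypothetical, and the reference directory is simply the one used in this post; the rules themselves come from the original doc and may not hold for every Torrent Suite version.

```shell
# Hypothetical helper: succeeds if $1 is non-empty and contains only
# letters, digits, underscore and period.
valid_short_name() {
    case "$1" in
        '' | *[!A-Za-z0-9_.]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical helper: succeeds if no entry named $1 exists under
# reference directory $2.
name_is_free() {
    [ ! -e "$2/$1" ]
}

# Example check; the directory is the one used in this post, adjust as needed.
ref_dir=/results/referenceLibrary/tmap-f2
name=A_flavithermus
if valid_short_name "$name" && name_is_free "$name" "$ref_dir"; then
    echo "$name looks usable"
else
    echo "$name is invalid or already taken"
fi
```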

Index Creation

~~The alignment package (ion-alignment) comes with a wrapper script, build_genome_index.pl, that automates the TMAP index creation process. It requires four inputs:~~

~~single FASTA file~~
~~short form of the genome name (see previous section)~~
~~long form of the genome name (see next section for description)~~
~~genome version (see next section for description)~~

~~The steps to create the index:~~

~~move or copy the FASTA file to the standard reference location~~
~~run under the standard reference location~~

$ cd /results/referenceLibrary/tmap-f2/
$ build_genome_index.pl --fasta A_flavithermus.fasta -s A_flavithermus \
    -v "gi|212637849|ref|NC_011567.1" \
    -l "Anoxybacillus flavithermus WK1 chromosome complete genome"
Copying A_flavithermus.fasta to A_flavithermus/A_flavithermus.fasta...
  ...copy complete
Making tmap index...
  ...tmap index complete
Making samtools index...
  ...samtools index complete

There should now be 10 files in the directory, including the original fasta file. The size of the files varies by genome: for the human genome (3,000,000,000 bases in length) the combined size of all index files, including the original fasta file itself, is just under 8 GB. For E. coli (4,600,000 bases in length) it is about 0.4 GB.

~~You might want to delete the fasta file to keep things tidy~~

~~rm A_flavithermus.fasta~~

$ ls -1 /results/referenceLibrary/tmap-f2/A_flavithermus/
A_flavithermus.fasta
A_flavithermus.fasta.fai
A_flavithermus.fasta.md5
A_flavithermus.fasta.tmap.anno
A_flavithermus.fasta.tmap.bwt
A_flavithermus.fasta.tmap.pac
A_flavithermus.fasta.tmap.rbwt
A_flavithermus.fasta.tmap.rpac
A_flavithermus.fasta.tmap.rsa
A_flavithermus.fasta.tmap.sa
A_flavithermus.info.txt
samtools.log
tmap.log
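
To confirm that the build produced everything in the listing above, a check along these lines may help. The function name check_index is hypothetical, and the suffix list is copied from the listing in this post, so it may differ between Torrent Suite versions.

```shell
# Hypothetical helper: print each expected index file missing from
# directory $1 for short name $2; succeeds only if none are missing.
# The suffix list is taken from the listing in this post (an assumption
# about what a complete index looks like).
check_index() {
    dir=$1
    name=$2
    missing=0
    for suffix in fasta fasta.fai fasta.md5 \
                  fasta.tmap.anno fasta.tmap.bwt fasta.tmap.pac \
                  fasta.tmap.rbwt fasta.tmap.rpac fasta.tmap.rsa fasta.tmap.sa \
                  info.txt; do
        if [ ! -e "$dir/$name.$suffix" ]; then
            echo "missing: $name.$suffix"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}

# Example call using the directory from the build step in this post.
check_index /results/referenceLibrary/tmap-f2/A_flavithermus A_flavithermus \
    || echo "index looks incomplete"
```

The two log files are left out of the check on purpose, since they are by-products of the build rather than part of the index.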

Adding the Genome to the PGM Drop-down Menu

For additional convenience it is also recommended (though not required) to add the genome to the list that is made available on the PGM as a drop-down menu - this can be very helpful in avoiding typos on the PGM.

updateref will crawl through the directory, grab the genome_shortname fields from all installed reference libraries of the version specified, and overwrite reference_list.txt. When updateref is called without any command-line arguments, it uses the default settings; for example, /results/PGM_config is the default location of the PGM configuration. The location is crucial because it needs to be under the same root directory to which the PGM transfers its data. For example, if PGMs transfer data to a file server that is mounted as /mnt/PGM_Data on the Torrent Server, the option -p /mnt/PGM_Data/PGM_config needs to be specified. updateref --help will list more options.

~~Default settings. PGM data are stored in /results.~~

$ sudo updateref
List of library
-> ampl_valid
-> vibrio_fisch
-> e_coli_k12
-> e_coli_dh10b
-> rhodopalu

~~Customized environment. PGM data are stored in /mnt/PGM_Data.~~

$ sudo updateref -p /mnt/PGM_Data/PGM_config

~~You may also manually edit the text file (I did this as I couldn't find updateref)~~

~~sudo vim /results/PGM_config/reference_list.txt~~

~~insert the shortname into the txt file~~
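
A scripted version of that manual edit could look like the sketch below. The function name add_reference_name is hypothetical, and the one-name-per-line layout of reference_list.txt is an assumption on my part; the demo runs against a scratch file rather than the live one.

```shell
# Hypothetical helper: append short name $1 to list file $2 unless an
# identical line is already there, so repeated runs do not duplicate it.
# Assumes the list holds one short name per line.
add_reference_name() {
    touch "$2"
    grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Demo against a scratch copy rather than the live file.
list=$(mktemp)
printf 'e_coli_dh10b\n' > "$list"
add_reference_name A_flavithermus "$list"
add_reference_name A_flavithermus "$list"   # second call is a no-op
cat "$list"
```

Running this against /results/PGM_config/reference_list.txt itself would need sudo, just like the vim edit.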

Update: manually editing the text file doesn't make the genome appear in the libraries for the realignment plugin. Curiously, after adding the reference genome via the web browser, the genome name doesn't appear here.

3 comments:

Anonymous, 15 September 2011 at 22:16
Some of this information is out of date and only applies to previous versions of Torrent Suite Software. It is best for users to go to the Ion Community directly at http://ioncommunity.iontorrent.com for the most up-to-date information.

Kevin, 15 September 2011 at 22:19
Hi, I got this information recently from the URL at the top, which is from the Ion Community page, and indeed it is outdated. Do you have the URL of the updated link within the community? Please share!

Anonymous, 18 September 2011 at 07:55
New references can be uploaded directly through Torrent Browser under the "References" Tab.
