For those who have trouble finding useful information in the tangled web of the Ion Community, here's a good guide that arrived in my inbox.
Where to find Information
The various Ion Community sites are the best place to start when looking for information quickly. There is a good range of technical documents, troubleshooting advice and how-to videos compiled by both PGM users and staff. The Ion Community consists of the following sites:
- Introduction to Ion Torrent - An introduction to semiconductor sequencing (no registration necessary)
- PGM Users Community - Wet lab site with protocols, user guides, training, documentation (registration necessary)
- Torrent Dev Community - Bioinformatics site with data, software, training, troubleshooting advice (registration necessary)
- Grand Challenge Community - Challenges with $7 million in cash prizes aimed to drive innovation on the PGM (registration necessary)
- Ion Torrent Publications - See what others have published using Ion Torrent data
Bioinformatics Videos on the Ion Community
- Introduction to Ion Torrent Software
- Torrent Suite Runs Tab
- Report & Reference Tabs in Torrent Suite Software
- Working with Reference Genomes in Torrent Suite
- Torrent Suite Software Realignment Plugin
- Introduction to the Torrent Suite Development Framework
General Documentation
- Torrent Suite User Documentation (Version 1.3)
- Torrent Suite User Documentation (Version 1.4)
- Torrent Suite Software Development Kit (SDK) Information
Was the Run Successful?
The wet lab training covers how to read the run report and a basic understanding of the metrics it contains. A complete description of the information in the report, how it is generated and what it means can be found in the Torrent Browser Analysis Report Guide. There is also a video.
Data Analysis Software
Currently, most PGM users make complete alignments on their Torrent Server. These alignments are made by default in order to produce run statistics. It is possible to randomly sample reads from the FASTQ file, but because the output from the 314 and 316 chips is manageable, sampling is turned off by default (meaning that 100% of reads are used in this alignment). A summary of the files that are automatically created on the Torrent Server can be found here.
New in Torrent Suite 1.4, there is an Alignment Plugin that can perform a new alignment after the run is complete (instructional video). There is also a Variant Calling Plugin which uses mpileup from SAMtools to call SNPs and indels in VCF format. However, the plugins in Torrent Suite do not expose the full range of options available for these tools. The extended options are available if you run the tools from the command line on your offline server.
The standalone versions of these tools are freely available, but since they do not have a graphical interface, they are intended for bioinformaticians and researchers with Linux experience.
- TMAP - Aligns PGM reads to a reference. This is the software used to make alignments on the Torrent Server. A description of the software can be found here and discussion on parameters for offline analysis can be found here. There is even a manual.
- SAMtools - Various SAM/BAM manipulation tools and variant caller
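As an illustration of the FASTQ read sampling mentioned earlier in this section, here is a minimal Python sketch. It is not the Torrent Suite implementation; the function names are made up for this example, and it assumes plain four-line FASTQ records (no wrapped sequence lines).

```python
import random
from typing import Iterable, Iterator, List


def read_fastq(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield FASTQ records as lists of four lines (header, sequence, '+', qualities)."""
    record: List[str] = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield record
            record = []


def sample_fastq(lines: Iterable[str], fraction: float, seed: int = 0) -> Iterator[List[str]]:
    """Keep each record independently with probability `fraction`.

    fraction=1.0 keeps every read, which corresponds to sampling being turned off.
    """
    rng = random.Random(seed)  # fixed seed so the same subset is drawn on every run
    for record in read_fastq(lines):
        if rng.random() < fraction:
            yield record
```

Passing an open file handle as `lines` streams the reads, so the whole FASTQ file never has to fit in memory.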
Life Technologies Partners
- DNASTAR SeqMan - SeqMan videos and tutorials are here (must log in)
- Avadis NGS
- Partek
- SoftGenetics NextGENe - Access how-to videos posted by SoftGenetics here (must log in)
Other Vendors
Technology Overview
A high-level overview of the technology along with some videos can be found here.
Hardware
This is a breakdown of the various hardware components of the PGM and Torrent Server.
PGM
- Wind River Linux Operating System - http://www.bsdi.com/
- Two 2TB drives
- One is used for the operating system and the other is used for the results storage
Torrent Server
- Ubuntu (10.04 LTS) Lucid Lynx Linux Operating System - http://www.ubuntu.com
- Eight 2TB drives
- Formatted RAID5 leaving 11TB of usable space and can withstand two simultaneous drive failures
- Two 6-core CPUs
- 48GB RAM
- More information can be found here
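As a quick sanity check on the storage figures above, the arithmetic can be sketched in Python. This assumes the textbook RAID definitions (RAID 5 spends one drive's worth of capacity on parity and survives one drive failure; RAID 6 spends two and survives two) and ignores filesystem overhead. The function names are only for illustration.

```python
def usable_tb(drives: int, drive_tb: int, parity_drives: int) -> int:
    """Usable capacity in decimal terabytes after subtracting parity."""
    return (drives - parity_drives) * drive_tb


def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


for name, parity in (("RAID 5", 1), ("RAID 6", 2)):
    tb = usable_tb(drives=8, drive_tb=2, parity_drives=parity)
    print(f"{name}: {tb} TB = {tb_to_tib(tb):.1f} TiB, survives {parity} drive failure(s)")
```

Under these assumptions, eight 2TB drives give 14 TB in RAID 5 and 12 TB (about 10.9 TiB) in RAID 6. The quoted 11TB and the two-drive fault tolerance line up better with the RAID 6 numbers, so it may be worth checking the actual RAID level on your own server.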
Key Files and Locations
A quick summary of the location of key files and the file structure on the Torrent Server.
- /results/[PGM]/[Run]/ - Storage location for primary data files copied from PGM
- DAT files - the raw voltage files copied over from the PGM
- explog.txt - Contains name-value pairs with run level information that was entered on the PGM
- The presence of this file and the last DAT file are what causes ionCrawler to insert the new run into the database
- explog_final.txt - Same as explog.txt
- The presence of this file is what signifies the end of the analysis and if enabled can be used as a trigger for the PGM to auto-remove the run from the instrument
- /results/analysis/output/Home/[Analysis]/ - Storage location for primary analysis data and BAM file if mapping was done
- *.sff.zip - Zip file containing the SFF file
- *.fastq.zip - Zip file containing the FASTQ file
- *.bam - BAM file containing alignment data against the specified reference
- *.support.zip - Archive containing system-level information and logs from when the analysis was done
- /results/analysis/output/Home/[Analysis]/plugin_out/[Plugin]/ - Storage location for plugin specific output files
- ./Alignment_out/ - Storage location for the realignment plugin
- *.bam - BAM file containing alignment data against the specified secondary reference
- ./variantCalling_out/ - Storage location for the variant caller plugin
- *.vcf.gz - VCF file containing variant calls from the variant call pipeline
- /opt/ion/iondb/ - Primary location for Torrent Suite Software executables
- TLScript.py - Master script which drives the initial analysis from DAT to BAM
- /etc/init.d/ - Primary location for Torrent Server service scripts
- ionCrawler - Responsible for looking at the local file structure and identifying when a new run has been copied over by looking for the explog.txt file and the presence of the last DAT file and inserting the new run into the database
- ionJobServer - Responsible for submitting primary analysis jobs to SGE
- ionPlugin - Same as ionJobServer only it drives the plugin analyses
- /var/log/ion/ - Primary storage location for Ion-specific process logs
- crawl.log - ionCrawler service log
- jobserver.log - ionJobServer service log
- ionPlugin.log - ionPlugin service log
- iarchive.log - Archive service log
- tsconf - TSconfig service log
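The layout above lends itself to a small helper script. The following Python sketch collects the output files of one analysis directory using the file patterns listed above, and reads the name-value pairs from explog.txt. The function names are hypothetical, and the `:` separator in explog.txt is an assumption, since the list above only says the file holds name-value pairs.

```python
from pathlib import Path
from typing import Dict, List

# File patterns taken from the analysis directory listing above.
OUTPUT_PATTERNS = ("*.sff.zip", "*.fastq.zip", "*.bam", "*.support.zip")


def find_analysis_outputs(analysis_dir: str) -> Dict[str, List[Path]]:
    """Map each known pattern to the matching files directly inside analysis_dir."""
    root = Path(analysis_dir)
    return {pattern: sorted(root.glob(pattern)) for pattern in OUTPUT_PATTERNS}


def parse_explog(text: str, sep: str = ":") -> Dict[str, str]:
    """Parse 'name<sep>value' lines; the separator is an assumption, not documented here."""
    pairs: Dict[str, str] = {}
    for line in text.splitlines():
        name, found, value = line.partition(sep)
        if found and name.strip():  # skip lines that contain no separator
            pairs[name.strip()] = value.strip()
    return pairs
```

For example, `find_analysis_outputs("/results/analysis/output/Home/[Analysis]")`, with the placeholder replaced by a real analysis directory name, returns the BAM and zipped SFF/FASTQ files for that analysis.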
Data Formats
More information is available at Ion Torrent File Formats.
DAT
More information can be found at File Format - Raw Image Acquisition File (DAT).
WELLS
More information can be found at File Format - 1.wells files (WELLS).
SFF
Standard Flowgram Format (SFF) is a standard binary format used to encode the sequence data in flow space. More information can be found at File Format - Sequence Files (SFF and FASTQ) and here.
FASTQ
A standard text-based format for storing raw sequence information.
@J16EU:4:72
CCTCACCCGCCGTCACGTGATGAAAGGATTACTGCTGTTGCTCGGCGCTGGCGGAGGCTGGCAGCTCTGGCAGTC
+
4144655,7715777778997788,3366168858777377755/444551563773654.100.'.....0...
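To make the four-line layout of the record above concrete, here is a minimal Python sketch that parses one FASTQ record and decodes its quality string. It assumes the common Phred+33 (Sanger) encoding; the class and function names are only for illustration.

```python
from typing import List, NamedTuple


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    qualities: List[int]


def parse_fastq_record(text: str, offset: int = 33) -> FastqRecord:
    """Parse a single four-line FASTQ record and decode per-base quality scores."""
    header, sequence, plus, quality = text.strip().split("\n")
    if not header.startswith("@") or not plus.startswith("+"):
        raise ValueError("not a four-line FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # Each quality character encodes one score: ASCII code minus the offset.
    return FastqRecord(header[1:], sequence, [ord(c) - offset for c in quality])
```

Under the Phred+33 assumption, the first quality character of the record above, `4`, decodes to a score of 19.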
SAM/BAM
Sequence Alignment / Map (SAM) format is a standard text-based format for storing alignment results. BAM is the binary equivalent of SAM.
J16EU:755:498 0 gi|49175990|ref|NC_000913.2| 8 70 39M1D42M16S * 0 0
CATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCGTGCTGAGACTGACAG
?<?;??=???>??;?>>6;77;;:;<;=;=<==;?;??;?????':8:==<==>><=<9;9:94:9<<6:;:/4052/2...9;;;=
<;;:752/.. RG:Z:J16EU PG:Z:tmap MD:Z:39^A42 NM:i:1 AS:i:74 XA:Z:map3-1
XS:i:-2147483647 XT:i:56
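The CIGAR field of the record above (39M1D42M16S) describes how the read lines up against the reference. A minimal Python sketch for interpreting it follows, using the operation definitions from the SAM specification; the function names are only for illustration.

```python
import re
from typing import List, Tuple

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_OPS = set("MIS=X")      # operations that consume read bases
REFERENCE_OPS = set("MDN=X")  # operations that consume reference bases


def parse_cigar(cigar: str) -> List[Tuple[int, str]]:
    """Split a CIGAR string into (length, operation) pairs."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Rebuild the string to reject anything the pattern silently skipped.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return ops


def cigar_lengths(cigar: str) -> Tuple[int, int]:
    """Return (read bases consumed, reference bases consumed)."""
    ops = parse_cigar(cigar)
    query = sum(n for n, op in ops if op in QUERY_OPS)
    reference = sum(n for n, op in ops if op in REFERENCE_OPS)
    return query, reference
```

For 39M1D42M16S this gives 97 read bases, matching the length of the sequence in the record, and 82 reference bases, so with a 1-based start position of 8 the alignment covers reference positions 8 to 89. The final 16 bases are soft-clipped and not part of the alignment.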
VCF
Variant Call Format (VCF) is a standard text-based format for storing information on genomic variations (e.g., SNPs and Indels).
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT snappqc
gi|49175990|ref|NC_000913.2| 547694 . A G 59 . DP=9;AF1=1;CI95=1,1;
DP4=0,0,2,7;MQ=56;FQ=-54 GT:PL:GQ 1/1:92,27,0:51
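The INFO and sample columns of the record above pack several values into one field each. A minimal Python sketch for unpacking them is shown below; it is a plain string-splitting illustration with made-up function names, not a full VCF parser, and it leaves all values as strings.

```python
from typing import Dict, Union


def parse_info(info: str) -> Dict[str, Union[str, bool]]:
    """Split a VCF INFO column into a dict; keys without '=' are stored as True flags."""
    fields: Dict[str, Union[str, bool]] = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def parse_sample(format_keys: str, sample: str) -> Dict[str, str]:
    """Pair the colon-separated FORMAT keys with the values of one sample column."""
    return dict(zip(format_keys.split(":"), sample.split(":")))


info = parse_info("DP=9;AF1=1;CI95=1,1;DP4=0,0,2,7;MQ=56;FQ=-54")
sample = parse_sample("GT:PL:GQ", "1/1:92,27,0:51")
print(info["DP4"], sample["GT"])
```

In SAMtools output, DP4 lists the counts of reference-forward, reference-reverse, alternate-forward and alternate-reverse bases, so here all 9 supporting bases carry the alternate allele, which is consistent with the 1/1 genotype.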
Very good information, where did you get all of this?
Very useful. Thank you for the effort.
Thanks for the information!
I wonder if there is any Ion Proton sequence data available yet?
Thank you so much!
Thanks for the blog! In case anyone is still trying to use the above links and finds them broken, the Ion Community is now hosted here: https://ioncommunity.lifetechnologies.com/welcome
And by the end of 2015 will migrate to https://ioncommunity.thermofisher.com/
Dear Ion Torrent employee,
very good remark. Do you by any chance know if the document describing the DAT file format (File Format - Raw Image Acquisition File (DAT)) is still available somewhere?
Thanks ! Elain
Dear Kevin,
very nice info collection!
Unfortunately the link to the File Format - Raw Image Acquisition File (DAT)
is broken (and the file format description does not seem to be on the new repositories above). Do you have an idea where I can get some info about the DAT file structure?
Thanks a lot for your help, Elain!