Thursday, 5 April 2012

FAQ: Annotation of reads, Tophat, Galaxy

Another post by Jennifer that I feel answers a frequently asked question and should be on a wiki somewhere.

---------- Forwarded message ----------
From: Jennifer Jackson


The tools TopHat/Cufflinks will map and assemble transcripts from sequencing reads. This mapping gives each component (short read, transcript, gene boundary) genomic coordinates with respect to the target reference genome.

Annotation is also mapped to the reference genome by genomic coordinates. It can be derived from different sources; a look at a genome browser project that focuses on annotation will help you to understand the concept. Good choices can be found under the Galaxy tool group "Get Data".

One way to merge the two (assign "annotation" to a "sequence/transcript/gene") is to identify overlapping coordinate regions on the reference genome between the two. Please see the tools in the group "Operate on Genomic Intervals" and the associated wiki for help/choices. Galaxy is a good resource for this type of analysis.
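
To make the overlap idea concrete, here is a minimal sketch in Python. It is not how Galaxy's "Operate on Genomic Intervals" tools are implemented. The `Interval` type, the `overlaps` and `annotate` helpers, and all coordinates and names (`TCONS_1`, `geneA`, and so on) are invented for illustration. Coordinates are assumed to be half-open (start included, end excluded), as in BED files.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A named region on a reference sequence (hypothetical helper type)."""
    chrom: str
    start: int  # assumed 0-based, inclusive
    end: int    # assumed exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two regions share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def annotate(transcripts, annotations):
    """Map each transcript name to the names of all overlapping annotations."""
    return {
        t.name: [g.name for g in annotations if overlaps(t, g)]
        for t in transcripts
    }


# Made-up assembled transcripts (e.g. what a transcript assembler might report).
transcripts = [
    Interval("chr1", 100, 500, "TCONS_1"),
    Interval("chr1", 900, 1200, "TCONS_2"),
    Interval("chr2", 100, 500, "TCONS_3"),
]

# Made-up annotation records (e.g. gene models from an annotation track).
annotations = [
    Interval("chr1", 400, 800, "geneA"),
    Interval("chr1", 1000, 1100, "geneB"),
]

labels = annotate(transcripts, annotations)
print(labels)
```

A transcript with no overlapping annotation ends up with an empty list, which is a useful signal in itself. The pairwise comparison is fine for a handful of records; for real datasets the Galaxy interval tools (or a sorted/indexed approach) are the better choice.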

Another way to obtain annotation is to run annotation algorithms directly on the sequence data itself. This is a large and varied analysis space. The public main Galaxy server has some tools for this type of analysis, and more are available if you decide to run a local/cloud instance with repositories from the Tool Shed.

For annotation, it is best to know what you are looking for. Perform some searches, both within the web tools you prefer and with a search tool such as Galaxy, use that research to determine the best platform on which to run the tool, then sort out the technical details. For general technical 'how-to-use' help with Galaxy, plus some basic scientific operations, these are good places to get oriented/started:


Galaxy team

On 4/3/12 7:30 AM, hsharm wrote:
> Dear galaxy users,
> This might be a very basic question to most of you. But I was hoping I
> could get better understanding of this concept by asking you all.
> How exactly can we accomplish annotation of our reads? The combination
> of Tophat and cufflinks does annotate genes right? I am a bit confused
> regarding this topic. Any help will be much appreciated.
> Thanks.
