Kevin's GATTACA World: pyrosequencing

Saturday, 12 March 2011

Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons [RESOURCES]

from Genome Research current issue by Haas, B. J., Gevers, D., Earl, A. M., Feldgarden, M., Ward, D. V., Giannoukos, G., Ciulla, D., Tabbaa, D., Highlander, S. K., Sodergren, E., Methe, B., DeSantis, T. Z., The Human Microbiome Consortium, Petrosino, J. F., Knight, R., Birren, B. W.

Bacterial diversity among environmental samples is commonly assessed with PCR-amplified 16S rRNA gene (16S) sequences. Perceived diversity, however, can be influenced by sample preparation, primer selection, and formation of chimeric 16S amplification products. Chimeras are hybrid products between multiple parent sequences that can be falsely interpreted as novel organisms, thus inflating apparent diversity. We developed a new chimera detection tool called Chimera Slayer (CS). CS detects chimeras with greater sensitivity than previous methods, performs well on short sequences such as those produced by the 454 Life Sciences (Roche) Genome Sequencer, and can scale to large data sets. By benchmarking CS performance against sequences derived from a controlled DNA mixture of known organisms and a simulated chimera set, we provide insights into the factors that affect chimera formation such as sequence abundance, the extent of similarity between 16S genes, and PCR conditions. Chimeras were found to reproducibly form among independent amplifications and contributed to false perceptions of sample diversity and the false identification of novel taxa, with less-abundant species exhibiting chimera rates exceeding 70%. Shotgun metagenomic sequences of our mock community appear to be devoid of 16S chimeras, supporting a role for shotgun metagenomics in validating novel organisms discovered in targeted sequence surveys.
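To make the idea of a "hybrid product between multiple parent sequences" concrete, here is a toy sketch in Python. It is not the Chimera Slayer algorithm, which the abstract does not describe in detail; it only assumes that a query and two candidate parents are already aligned to the same length, and asks whether a two-parent split explains the query better than either parent alone. The function names and the `min_gain` threshold are hypothetical.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions where two equal-length aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def best_chimera_split(query: str, parent_a: str, parent_b: str):
    """Return (breakpoint, chimera_mismatches, best_single_parent_mismatches).

    Toy model: parent_a explains the query left of the breakpoint and
    parent_b explains it to the right. Inputs must be pre-aligned.
    """
    if not (len(query) == len(parent_a) == len(parent_b)):
        raise ValueError("sequences must be aligned to equal length")
    single = min(mismatches(query, parent_a), mismatches(query, parent_b))
    best_bp, best_mm = None, None
    for bp in range(1, len(query)):
        mm = (mismatches(query[:bp], parent_a[:bp])
              + mismatches(query[bp:], parent_b[bp:]))
        if best_mm is None or mm < best_mm:
            best_bp, best_mm = bp, mm
    return best_bp, best_mm, single


def looks_chimeric(query: str, parent_a: str, parent_b: str, min_gain: int = 3) -> bool:
    """Flag the query if a two-parent model (either order) removes at least
    min_gain mismatches compared with the best single parent.
    min_gain is an arbitrary illustrative threshold."""
    _, mm_ab, single = best_chimera_split(query, parent_a, parent_b)
    _, mm_ba, _ = best_chimera_split(query, parent_b, parent_a)
    return single - min(mm_ab, mm_ba) >= min_gain


if __name__ == "__main__":
    a, b = "AAAAAAAAAA", "CCCCCCCCCC"
    print(best_chimera_split("AAAAACCCCC", a, b))
    print(looks_chimeric("AAAAACCCCC", a, b))
```

A real detector also has to find the candidate parents in a reference database and cope with alignment gaps and sequencing errors, none of which this sketch attempts.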

Friday, 4 March 2011

Guide/tutorial for the analysis of RNA-seq data

link in seqanswers

Excellent starting point for those confused about the RNA-seq data analysis procedure.

Hello,

I've written a guide to the analysis of RNA-seq data, for the purpose of differential expression analysis. It currently lives on our internal wiki that can't be viewed outside of our division, although printouts have been used at workshops. It is by no means perfect and very much a work in progress, but a number of people have found it helpful, so I thought it would be useful to have it somewhere more publicly accessible.

I've attached a pdf version of the guide, although really what I was hoping was that someone here could suggest somewhere where it could be publicly hosted as a wiki. This area is so multifaceted and fast-moving that the only way such a guide can remain useful is if it can be constantly extended and updated.

If anyone has any suggestions about potential hosting, they can contact me at myoung @wehi.edu.au

Cheers

Matt

Update: I've put a few extra things on our local wiki, and seeing as people here seem to be finding this useful, I thought I'd post an updated version. I'm also an author on a review paper on differential expression using RNA-seq, which people who find the guide useful might also find relevant...

RNA-seq Review

Wednesday, 2 March 2011

Papers on Comparison of microRNA profiling platforms

Systematic Evaluation of Three microRNA Profiling Platforms: Microarray, Beads Array, and Quantitative Real-Time PCR Array

Background

A number of gene-profiling methodologies have been applied to microRNA research. The diversity of the platforms and analytical methods makes the comparison and integration of cross-platform microRNA profiling data challenging. In this study, we systematically analyze three representative microRNA profiling platforms: Locked Nucleic Acid (LNA) microarray, beads array, and TaqMan quantitative real-time PCR Low Density Array (TLDA).

Systematic comparison of microarray profiling, real-time PCR, and next-generation sequencing technologies for measuring differential microRNA expression

Abstract
RNA abundance and DNA copy number are routinely measured in high-throughput using microarray and next-generation sequencing (NGS) technologies, and the attributes of different platforms have been extensively analyzed. Recently, the application of both microarrays and NGS has expanded to include microRNAs (miRNAs), but the relative performance of these methods has not been rigorously characterized. We analyzed three biological samples across six miRNA microarray platforms and compared their hybridization performance. We examined the utility of these platforms, as well as NGS, for the detection of differentially expressed miRNAs. We then validated the results for 89 miRNAs by real-time RT-PCR and challenged the use of this assay as a “gold standard.” Finally, we implemented a novel method to evaluate false-positive and false-negative rates for all methods in the absence of a reference method.

Tuesday, 21 September 2010

Simple Copy Number Determination with Reference Query Pyrosequencing (RQPS)

Zhenyi Liu¹,³, Daniel L. Schneider¹, Kerry Kornfeld¹, and Raphael Kopan¹,²,³

¹ Department of Developmental Biology, School of Medicine, Washington University, St. Louis, Missouri 63110, USA

² Division of Dermatology, Department of Medicine, School of Medicine, Washington University, St. Louis, Missouri 63110, USA

The accurate measurement of the copy number (CN) for an allele is often desired. We have developed a simple pyrosequencing-based method, reference query pyrosequencing (RQPS), to determine the CN of any allele in any genome by taking advantage of the fact that pyrosequencing can accurately measure the molar ratio of DNA fragments in a mixture that differ by a single nucleotide. The method involves the preparation of an RQPS probe, which contains two linked DNA fragments that match a reference allele with a known CN and a query allele with an unknown CN.
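The arithmetic behind this idea can be sketched in a few lines of Python. This is a simplified model of my own, not the protocol from the paper. It assumes that the probe is mixed with genomic DNA, that the probe contributes equal molar amounts of the reference and query fragments (since they are linked), and that pyrosequencing reports, at each site, the fraction of signal coming from the genomic allele rather than the probe. Under those assumptions the genomic-to-probe odds at each site are proportional to the allele's copy number, so the unknown amount of probe cancels out. The function and parameter names are hypothetical.

```python
import math


def query_copy_number(frac_genomic_ref: float,
                      frac_genomic_query: float,
                      ref_copy_number: float) -> float:
    """Estimate the query allele's copy number under the simplified model.

    frac_genomic_ref / frac_genomic_query: fraction of pyrosequencing signal
    at the reference / query site that comes from genomic DNA (the rest is
    assumed to come from the probe).
    """
    for f in (frac_genomic_ref, frac_genomic_query):
        if not 0.0 < f < 1.0:
            raise ValueError("fractions must lie strictly between 0 and 1")
    # odds = genomic signal / probe signal at each site
    odds_ref = frac_genomic_ref / (1.0 - frac_genomic_ref)
    odds_query = frac_genomic_query / (1.0 - frac_genomic_query)
    # The probe amount is the same at both sites, so it cancels in the ratio.
    return ref_copy_number * odds_query / odds_ref


if __name__ == "__main__":
    # Reference allele known to be 2 copies; genomic fraction 2/3 at the
    # reference site and 1/2 at the query site.
    cn = query_copy_number(2 / 3, 1 / 2, ref_copy_number=2)
    print(round(cn, 3))
```

In the example, the genomic signal at the query site is half as strong relative to the probe as at the reference site, so the estimate comes out at half the reference copy number.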

Sunday, 30 May 2010

Cofactor genomics on the different NGS platforms

Original post here

They are a commercial company that offers NGS on ABI and Illumina platforms, and since this is on their company page, I guess it's their official stance on what rocks on each platform.

Excerpted.

Applied Biosystems SOLiD 3

The Applied Biosystems SOLiD 3 has the shortest reads but also the highest number of reads. The SOLiD produces up to 240 million 50bp reads per slide per end. As with the Illumina, Mate-Pairs produce double the output by duplicating the read length on each end, and the SOLiD supports a variety of insert lengths like the 454. The SOLiD can also run 2 slides at once to again double the output. SOLiD has the lowest *raw* base qualities but the highest processed base qualities when using a reference due to its 2-base encoding. Because of the number of reads and more advanced library types, we recommend the SOLiD for all RNA and bisulfite sequencing projects.

Solexa/Illumina

The Solexa/Illumina generates shorter reads at 36-75bp but produces up to 160 million reads per run. All reads are of similar length. The Illumina has the highest *raw* quality scores and its errors are mostly base substitutions. Paired-end reads with ~200 bp inserts are possible with high efficiency and double the output of the machine by duplicating the read length on each end. Paired-end Illumina reads are suitable for de novo assemblies, especially in combination with 454. The large number of reads makes the Illumina appropriate for de novo transcriptome studies with simultaneous discovery and quantification of RNAs at qRT-PCR accuracy.

Roche/454 FLX

The Roche/454 FLX with Titanium chemistry generates the longest reads (350-500bp) and the most contiguous assemblies, can phase SNPs or other features into blocks, and has the shortest run times. However, 454 also produces the fewest total reads (~1 million) at the highest cost per base. Read lengths are variable. Errors occur mostly at the ends of long same-nucleotide stretches. Libraries can be constructed with many insert sizes (8kb - 20kb) but at half of the read length for each end and with low efficiency.
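For a rough sense of scale, the read counts and read lengths quoted in the excerpt can be multiplied out into approximate yields. The short Python sketch below uses only the numbers given above; the dictionary layout and function name are my own. Note that the excerpt quotes SOLiD per slide per end and the others per run, so the totals are not strictly like for like.

```python
# (max reads, shortest read length in bp, longest read length in bp),
# taken from the excerpt above.
PLATFORMS = {
    "SOLiD 3 (per slide, per end)": (240_000_000, 50, 50),
    "Solexa/Illumina (per run)": (160_000_000, 36, 75),
    "Roche/454 FLX Titanium (per run)": (1_000_000, 350, 500),
}


def yield_gb(reads: int, read_length_bp: int) -> float:
    """Total sequenced bases in gigabases (1 Gb = 1e9 bases)."""
    return reads * read_length_bp / 1e9


if __name__ == "__main__":
    for name, (reads, shortest, longest) in PLATFORMS.items():
        low, high = yield_gb(reads, shortest), yield_gb(reads, longest)
        if low == high:
            print(f"{name}: up to {high:.2f} Gb")
        else:
            print(f"{name}: up to {low:.2f} to {high:.2f} Gb")
```

On these figures the 454 delivers well under 1 Gb per run while the short-read platforms reach several Gb, which is consistent with the excerpt's remark that 454 has the highest cost per base.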
