Thursday, 31 January 2013

A hybrid likelihood model for sequence-based disease association studies.

PLoS Genet. 2013 Jan;9(1):e1003224. doi: 10.1371/journal.pgen.1003224. Epub 2013 Jan 24.

Chen YC, Carter H, Parla J, Kramer M, Goes FS, Pirooznia M, Zandi PP, McCombie WR, Potash JB, Karchin R.

Source

Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America.

Abstract

In the past few years, case-control studies of common diseases have shifted their focus from single genes to whole exomes. New sequencing technologies now routinely detect hundreds of thousands of sequence variants in a single study, many of which are rare or even novel. The limitation of classical single-marker association analysis for rare variants has been a challenge in such studies. A new generation of statistical methods for case-control association studies has been developed to meet this challenge. A common approach to association analysis of rare variants is the burden-style collapsing methods to combine rare variant data within individuals across or within genes. Here, we propose a new hybrid likelihood model that combines a burden test with a test of the position distribution of variants. In extensive simulations and on empirical data from the Dallas Heart Study, the new model demonstrates consistently good power, in particular when applied to a gene set (e.g., multiple candidate genes with shared biological function or pathway), when rare variants cluster in key functional regions of a gene, and when protective variants are present. When applied to data from an ongoing sequencing study of bipolar disorder (191 cases, 107 controls), the model identifies seven gene sets with nominal p-values < 0.05, of which one MAPK signaling pathway (KEGG) reaches trend-level significance after correcting for multiple testing.

PMID: 23358228; [PubMed - in process]

http://www.ncbi.nlm.nih.gov/pubmed/23358228
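
For readers unfamiliar with the "burden-style collapsing" mentioned in the abstract, here is a minimal sketch of the general idea. It is not the hybrid likelihood model from the paper. It assumes genotypes are stored as per-individual lists of minor-allele counts, collapses each individual to carrier or non-carrier of any rare variant in the gene, and then runs a two-sided Fisher exact test on the resulting 2x2 table. All function names and the toy data are made up for illustration.

```python
from math import comb


def collapse_carriers(genotypes, rare_sites):
    """Return 1 for each individual carrying at least one rare allele, else 0."""
    return [int(any(g[s] > 0 for s in rare_sites)) for g in genotypes]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of seeing x carriers among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def burden_test(case_genotypes, control_genotypes, rare_sites):
    """Collapse rare variants per individual, then test carriers vs. status."""
    cases = collapse_carriers(case_genotypes, rare_sites)
    controls = collapse_carriers(control_genotypes, rare_sites)
    a, b = sum(cases), len(cases) - sum(cases)
    c, d = sum(controls), len(controls) - sum(controls)
    return (a, b, c, d), fisher_exact_two_sided(a, b, c, d)


if __name__ == "__main__":
    # Toy data (hypothetical): rows are individuals, columns are variant sites.
    toy_cases = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]
    toy_controls = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 1, 0]]
    table, p_value = burden_test(toy_cases, toy_controls, rare_sites=[0, 1, 2])
    print(table, p_value)
```

A pure carrier count like this ignores where in the gene the variants fall and treats protective and risk variants alike, which are the situations in which the abstract says the hybrid model does well.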

Fast and accurate read mapping with approximate seeds and multiple backtracking.

Nucleic Acids Res. 2013 Jan 28. [Epub ahead of print]

Siragusa E, Weese D, Reinert K.

Source

Department of Mathematics and Computer Science, Freie Universität Berlin, Takustr. 9, 14195 Berlin, Germany and Max Planck Institute for Molecular Genetics, Ihnestr. 63-73, 14195 Berlin, Germany.

Abstract

We present Masai, a read mapper representing the state-of-the-art in terms of speed and accuracy. Our tool is an order of magnitude faster than RazerS 3 and mrFAST, 2-4 times faster and more accurate than Bowtie 2 and BWA. The novelties of our read mapper are filtration with approximate seeds and a method for multiple backtracking. Approximate seeds, compared with exact seeds, increase filtration specificity while preserving sensitivity. Multiple backtracking amortizes the cost of searching a large set of seeds by taking advantage of the repetitiveness of next-generation sequencing data. Combined together, these two methods significantly speed up approximate search on genomic data sets. Masai is implemented in C++ using the SeqAn library. The source code is distributed under the BSD license and binaries for Linux, Mac OS X and Windows can be freely downloaded from http://www.seqan.de/projects/masai.

PMID: 23358824; [PubMed - as supplied by publisher]

http://www.ncbi.nlm.nih.gov/pubmed/23358824
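
To make the "approximate seeds" idea from the abstract a bit more concrete, here is a toy sketch of seed-and-verify filtration. It is not Masai's implementation. It assumes mismatches only (Hamming distance), scans the reference naively instead of using an index, and leaves out multiple backtracking entirely. It relies on the pigeonhole argument: if a read matches with at most k mismatches and is split into s seeds, at least one seed matches with at most floor(k/s) mismatches. Using fewer, longer seeds that tolerate errors is what makes the seeds approximate. Function names are made up for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def split_seeds(read, n_seeds):
    """Split read into n_seeds contiguous pieces; return (offset, piece) pairs."""
    base, extra = divmod(len(read), n_seeds)
    seeds, start = [], 0
    for i in range(n_seeds):
        length = base + (1 if i < extra else 0)
        seeds.append((start, read[start:start + length]))
        start += length
    return seeds


def map_read(read, reference, max_errors, n_seeds):
    """Return sorted start positions where read matches with <= max_errors mismatches."""
    seed_errors = max_errors // n_seeds  # pigeonhole bound on errors per seed
    candidates = set()
    # Filtration: any seed hit within seed_errors proposes a candidate location.
    for offset, seed in split_seeds(read, n_seeds):
        for pos in range(len(reference) - len(seed) + 1):
            if hamming(seed, reference[pos:pos + len(seed)]) <= seed_errors:
                start = pos - offset
                if 0 <= start <= len(reference) - len(read):
                    candidates.add(start)
    # Verification: check the whole read at each candidate location.
    return sorted(
        s for s in candidates
        if hamming(read, reference[s:s + len(read)]) <= max_errors
    )


if __name__ == "__main__":
    reference = "TTGACCAGTACGGATCAAGT"
    read = "CGAGTACCGATC"  # reference[4:16] with two substitutions
    print(map_read(read, reference, max_errors=2, n_seeds=2))  # approximate seeds
    print(map_read(read, reference, max_errors=2, n_seeds=3))  # exact seeds
```

With `n_seeds=3` each seed must match exactly, while `n_seeds=2` gives two longer seeds that each tolerate one mismatch. Longer seeds occur by chance less often, which is the specificity gain the abstract refers to.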

Friday, 25 January 2013

Fwd: [Biopython] Debian Med Sprint in Kiel, Germany 23rd/24th of February

Plug for Debian Med Sprint
---------- Forwarded message ----------
From: Steffen Möller <steffen_moeller gmx.de>
Date: Jan 25, 2013 11:25 PM
Subject: [Biopython] Debian Med Sprint in Kiel, Germany 23rd/24th of February
To: "Biopython Mailing List" <biopython lists.open-bio.org>
Cc:

> Dear all,
>
> We have our annual Debian/Ubuntu/Bio-Linux sprint on Bioinformatics again next month. Every year there are a few individuals more peripheral to the distribution attending, which usually helps us to develop our community further in some way. Anybody from BioPython interested to join in, please read through
> http://wiki.debian.org/DebianMed/Meeting/Kiel2013
> and just email me or add him/herself. There is not anything particular that I expect from the BioPython community, except for more and better ideas on how to develop research on and with tools in computational biology further.
> Registration is free. Accommodation and travel are not.
>
> Cheers,
>
> Steffen

Saturday, 19 January 2013

Watch out for cars and lung cancer

http://mendeliandisorder.blogspot.sg/2012/11/why-i-dont-want-to-know-my-genome.html

http://blogs.plos.org/dnascience/2012/11/01/why-i-dont-want-to-know-my-genome-sequence/

Interesting reads on a rainy Saturday.

I think that, at this point in time, believing whole genome sequencing (or even exome sequencing) is the way forward in medical health is akin to buying an extended warranty.
You don't need it now, but you are banking on cost savings when you actually do (sequencing one whole genome versus many small individual regions).

No doubt eventually, when the prescription of drugs depends on your genetic makeup, your DNA sequence will be invaluable or even compulsory. (Before I read this article I didn't even know that being slow to metabolize antipsychotics and beta blockers can be deadly.) Right now, genomics offers a glimpse into likely causal associations, which can be hard for the man on the street to act on beyond the advice of "don't smoke, exercise, eat a healthy diet, and don't worry about DNA sequences".

I would also add "watch out for cars" since 1.3 million people die yearly from auto accidents versus 1.4 million deaths attributed to lung cancer.

See
http://www.who.int/mediacentre/factsheets/fs358/en/index.html
http://www.cancerresearchuk.org/cancer-info/cancerstats/world/the-global-picture/

Thursday, 17 January 2013

Article: DSK: k-mer counting with very low memory usage

DSK: k-mer counting with very low memory usage
http://bioinformatics.oxfordjournals.org/content/early/2013/01/16/bioinformatics.btt020.short?buffer_share=64cbf&rss=1

We present a new streaming algorithm for k-mer counting, called DSK (disk streaming of k-mers), which only requires a fixed, user-defined amount of memory and disk space. This approach realizes a memory, time and disk trade-off. The multi-set of all k-mers present in the reads is partitioned and partitions are saved to disk. Then, each partition is separately loaded in memory in a temporary hash table. The k-mer counts are returned by traversing each hash table. Low-abundance k-mers are optionally filtered.

DSK is the first approach that is able to count all the 27-mers of a human genome dataset using only 4.0 GB of memory and moderate disk space (160 GB), in 17.9 hours. DSK can replace a popular k-mer counting software (Jellyfish) on small-memory servers.

Availability: http://minia.genouest.org/dsk
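
The partition-then-count idea in the abstract is simple enough to sketch in a few lines of Python. This is only a rough illustration under my own assumptions, not DSK itself: it hashes each k-mer with CRC32 to pick a partition file, makes a single pass over the reads, and does not canonicalize reverse complements or enforce the memory and disk budgets that the real tool handles. The function names and the `min_count` parameter are hypothetical.

```python
import os
import tempfile
import zlib
from collections import Counter


def kmers(read, k):
    """Yield every k-mer of a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def count_kmers_partitioned(reads, k, n_partitions, min_count=1):
    """Yield (kmer, count) pairs while holding only one partition in memory."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"part_{i}.txt") for i in range(n_partitions)]
        files = [open(p, "w") for p in paths]
        try:
            # Pass 1: stream k-mers to disk, routed by a hash of the k-mer,
            # so that identical k-mers always land in the same partition.
            for read in reads:
                for kmer in kmers(read, k):
                    idx = zlib.crc32(kmer.encode()) % n_partitions
                    files[idx].write(kmer + "\n")
        finally:
            for f in files:
                f.close()
        # Pass 2: load one partition at a time into a temporary hash table.
        for path in paths:
            table = Counter()
            with open(path) as f:
                for line in f:
                    table[line.strip()] += 1
            for kmer, count in table.items():
                if count >= min_count:  # optional low-abundance filter
                    yield kmer, count


if __name__ == "__main__":
    toy_reads = ["ACGTACGT", "CGTA"]
    for kmer, count in sorted(count_kmers_partitioned(toy_reads, k=3, n_partitions=4)):
        print(kmer, count)
```

Because identical k-mers always hash to the same partition, each temporary table holds complete counts for its k-mers, and peak memory is bounded by the largest partition rather than by the whole dataset.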


Article: Fecal Microbiota Transplantation — An Old Therapy Comes of Age — NEJM

Fascinating!
Fecal Microbiota Transplantation — An Old Therapy Comes of Age — NEJM
http://www.nejm.org/doi/full/10.1056/NEJMe1214816?query=TOC&#article


Wednesday, 9 January 2013

Article: Genomic basis for coral resilience to climate change

Genomic basis for coral resilience to climate change
http://www.pnas.org/content/early/2013/01/02/1210224110.short?buffer_share=f163c&rss=1

Different corals differ substantially in physiological resilience to environmental stress, but the molecular mechanisms behind enhanced coral resilience remain unclear. Here, we compare transcriptome-wide gene expression (via RNA-Seq using Illumina sequencing) among conspecific thermally sensitive and thermally resilient corals to identify the molecular pathways contributing to coral resilience.

Kevin's GATTACA World
