Saturday, 4 April 2015

Four mistakes to avoid if you are analyzing data: useful reminder!

I have been following plotly for a while, though I haven't had public data that I can play with on it. Anyway, they have a blog post that compiles the 4 deadly sins of data analysis. Do check it out!

Saturday, 15 November 2014

What 5G mobile networks portend for the future of personal genomics

OK, I saw this a while back (a month ago; yes, I have been busy).

I am already very impressed with 4G (LTE) speeds, but with 5G you can possibly achieve 150 MB/s to 940 MB/s, which is mind-blowing ...

That means you could possibly upload your own 100 GB BAM file from your mobile device in about two minutes at the top speed, reading those figures as megabytes per second (sorry, I wasn't thinking about how much faster a YouTube video would stream). Now Google is saying that they can store your genome (actually they meant your 30x WGS BAM file) for $25 a year. But with 5G speeds, why would I even bother with that?
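A quick back-of-envelope check of that upload time, as a rough sketch only. It assumes the quoted speeds are megabytes per second, that the BAM file is 100 GB in decimal units, and that the link sustains its peak rate the whole time (real networks won't). The function name `upload_seconds` is made up for this example.

```python
# Back-of-envelope upload times for a BAM file over a mobile link.
# Assumptions (not from the post): decimal units (1 GB = 1000 MB),
# speeds in megabytes per second, and a link that sustains its peak rate.

def upload_seconds(file_size_gb: float, speed_mb_per_s: float) -> float:
    """Return the seconds needed to move file_size_gb at speed_mb_per_s."""
    if speed_mb_per_s <= 0:
        raise ValueError("speed must be positive")
    return file_size_gb * 1000 / speed_mb_per_s


if __name__ == "__main__":
    for speed in (150, 940):
        seconds = upload_seconds(100, speed)
        print(f"100 GB at {speed} MB/s: {seconds:.0f} s (~{seconds / 60:.1f} min)")
```

Under those assumptions the upload takes about 11 minutes at 150 MB/s and a bit under 2 minutes at 940 MB/s, so think minutes rather than seconds.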

Heck, maybe in the future, with a USB OTG cable connected to Oxford Nanopore's MinION, your Android phone will be able to sequence DNA obtained from a buccal swab and upload it in real time. The cloud will align the FASTQ reads, call variants instantaneously, and download the 100 GB BAM to your microSD card.

Possible applications:

  1. Maybe in the future, besides asking whether you have a drug allergy, pharmacists will request to 'scan' your DNA for the most effective drug.
  2. Your DNA might serve as your own personal identity card.
  3. More routine sequencing of the human microbiome, to monitor your health in relation to the gut microflora or other sites.

I am keen to find out what you think you could do if you could carry your whole genome sequence with you and upload it via mobile networks. Drop in your comments please!

Saturday, 13 September 2014

Wednesday, 27 August 2014

tabix and VCF file size limits

Today I learned that tabix can index bgzipped VCF files of 4 TB (compressed) and possibly bigger ... Mind blown ...

Source: Samtools-help mailing list

Friday, 2 May 2014

Monday, 21 April 2014

Fwd: Welcome to the Google Genomics Preview


---------- Forwarded message ----------

Welcome to the Google Genomics Preview! You've been approved for early access to the API.

The goal of the Genomics API is to encourage interoperability and build a foundation to store, process, search, analyze and share tens of petabytes of genomic data.

We've loaded sample data from public BAM files:

* The complete 1000 Genomes Project

* Selections from the Personal Genome Project

How to get started:

* Follow the instructions in the developer documentation

* Try the sample genome browser which calls the API

* Try out the other open source examples -- an R script, Python MapReduce, and a Java file-based implementation

* Write your own code to call the API and explore new uses

This is only the beginning. Your feedback will be essential to make the API useful. Please submit feature requests, bugs and suggestions on our GitHub page.

Thank you for being part of the first wave. If you'd rather join with a different email address (Gmail or Google Apps domain), please fill out the request form with that address too, and we'll grant access soon. Thank you for your interest!


The Google Genomics team

Datanami, Woe be me