Kevin's GATTACA World: variant


Monday, 10 December 2012

Gabe Rudy "GATK is a Research Tool. Clinics Beware." | Our 2 SNPs…®

http://blog.goldenhelix.com/?p=1534

Gabe describes in great detail a bug he found in GATK's variant caller, which has been widely regarded as a reliable SNP caller.

I think that, in general, the 'unreliable' nature of next-gen sequencing data often has researchers seeking multiple sources of confirmation for variants before moving to publication.

Still, I am frankly surprised that GATK turned up an error, though as Gabe points out, it might be common to find Heisenbugs in software.

It's a poignant reminder that DTC genetic testing needs more work to avoid mistakes like these, which might be detrimental to personalised medicine.

"But my scary homozygous insertion (row 2) shows 153 reference bases and no reads supporting the insertion. Yet it was still called a homozygous variant!

I promptly sent an email off to 23andMe's exome team letting them know about what is clearly a bug in the GATK variant caller. They confirmed it was a bug that went away after updating to a newer release. I talked to 23andMe's bioinformatician behind the report face-to-face a bit at this year's ASHG conference, and it sounds like it was most likely a bug in the tool's multi-sample variant calling mode as this phantom insertion was a real insertion in one of the other samples.

Since there were 8,242 other InDels that match this pattern, I am most likely not looking at random noise but real "leaked" variants from other members of the 23andMe Exome Pilot Program. (Edit: After some analysis with a fixed version of GATK, Eoghan from 23andMe found that these genotypes [were] not leaked from other samples but completely synthetic.)"
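The kind of sanity check that would catch a call like Gabe's can be sketched in a few lines of Python. This is only my own illustration, not anything GATK or 23andMe actually runs: the `Call` record, the `is_suspicious` helper and the 0.2 threshold are all assumptions, and the second example call is made up for contrast. The idea is simply to flag any non-reference genotype whose reads do not support the alternate allele.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """A simplified genotype call (hypothetical record, not a real VCF parser)."""
    site: str        # label for the variant site
    genotype: str    # "0/0", "0/1" or "1/1"
    ref_reads: int   # reads supporting the reference allele
    alt_reads: int   # reads supporting the alternate allele


def is_suspicious(call, min_alt_fraction=0.2):
    """Return True when a non-reference genotype lacks read support.

    The 0.2 threshold is an arbitrary illustrative value.
    """
    if call.genotype == "0/0":
        return False
    depth = call.ref_reads + call.alt_reads
    if depth == 0:
        # A variant call with no reads at all cannot be supported.
        return True
    return call.alt_reads / depth < min_alt_fraction


calls = [
    # Read counts from the quoted post: 153 reference reads, none for the insertion.
    Call("insertion_row2", "1/1", ref_reads=153, alt_reads=0),
    # Made-up heterozygous call with balanced support, for contrast.
    Call("example_het", "0/1", ref_reads=40, alt_reads=35),
]

flagged = [c.site for c in calls if is_suspicious(c)]
print(flagged)
```

A filter like this only looks at one sample's read counts, so it says nothing about why the caller produced the genotype in the first place.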

Thursday, 16 February 2012

How Identical are Identical Twins? | Read Through Transcription

How Identical are Identical Twins?
December 19, 2011 by Ramesh Hariharan
We're looking at exome sequencing data on whole peripheral blood DNA of monozygotic twins (this data was generated by our collaborators, Jan Dumanski and his group at Uppsala University in Sweden). Monozygotic twins were earlier thought to be genetically identical; now we know that isn't completely true. How does one identify small mutations (SNPs and small InDels) that are present in one of the twins but not in the other? Or in general, how does one compare two different samples, for instance, to find somatic mutations that are present in a tumor sample but not present in the paired normal sample.
...
The 1000 Genomes Project estimated that a child has only around 50 new mutations relative to its parents. Monozygotic twins ought to be closer than that. And we are observing only the exomes (and some neighborhood) of these twins. So the real answer probably lies close to the bottom of the above table. However, as Jan Dumanski points out, much of the 1000 Genomes effort involved sequencing of oligoclonal/monoclonal lymphoblastoid cell lines, not quite directly comparable with whole peripheral blood.

http://blog.avadis-ngs.com/2011/12/how-identical-are-identical-twins/

Gosh, who knew that calling SNPs on identical twins could be such a complicated task?
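For what it's worth, the naive version of the comparison is easy to sketch. The snippet below is my own toy example and not the method from the Avadis post: the `private_variants` helper and the variant tuples are made up, and each variant is assumed to be keyed by (chromosome, position, ref, alt).

```python
def private_variants(sample_a, sample_b):
    """Return the variants seen only in sample_a and only in sample_b.

    Each variant is a (chrom, pos, ref, alt) tuple. This is a plain set
    difference, so it ignores coverage and call quality entirely.
    """
    a, b = set(sample_a), set(sample_b)
    return sorted(a - b), sorted(b - a)


# Hypothetical calls for two twins (illustrative values, not real data).
twin_a = [("chr1", 1000, "A", "G"), ("chr2", 5000, "C", "T")]
twin_b = [("chr1", 1000, "A", "G"), ("chr3", 750, "G", "GA")]

only_a, only_b = private_variants(twin_a, twin_b)
print(only_a)
print(only_b)
```

A plain difference like this treats a missed call in one twin the same as a true absence, which is a big part of why the real task is harder than it looks.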

update:
related links I found
Different sides of the same coin; twins and epigenetics

http://blogs.dnalc.org/2011/09/23/different-sides-of-the-same-coin-twins-and-epigenetics/
