Monday, 8 April 2019

FAAH-OUT This woman feels no pain!

"A woman in Scotland can feel virtually no pain due to a mutation ...
At age 65, the woman sought treatment for an issue with her hip, which turned out to involve severe joint degeneration despite her experiencing no pain. At age 66, she underwent surgery on her hand, which is normally very painful, and yet she reported no pain after the surgery. Her pain insensitivity was diagnosed by Dr Devjit Srivastava, Consultant in Anaesthesia and Pain Medicine at an NHS hospital in the north of Scotland and co-lead author of the paper....
which the researchers have described for the first time and dubbed FAAH-OUT."

Journal Reference:

    Abdella M. Habib, Andrei L. Okorokov, Matthew N. Hill, Jose T. Bras, Man-Cheung Lee, Shengnan Li, Samuel J. Gossage, Marie van Drimmelen, Maria Morena, Henry Houlden, Juan D. Ramirez, David L.H. Bennett, Devjit Srivastava, James J. Cox. Microdeletion in a pseudogene identified in a patient with high anandamide concentrations and pain insensitivity. British Journal of Anaesthesia, 2019; DOI: 10.1016/j.bja.2019.02.019

Thursday, 28 March 2019

1-liner bash to rename spaces in filenames or folder names

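# non-recursive: renames entries in the current directory only (bash)
# the -e test skips the literal pattern when nothing matches; -- protects names starting with a dash
for f in *\ *; do [ -e "$f" ] && mv -- "$f" "${f// /_}"; done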
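
The one-liner above only touches the current directory. For nested folders, something like the sketch below might do. It is not from any particular source: rename_spaces is a made-up helper name, and it assumes a find that supports -depth and -exec ... {} +. It skips a rename if the target name already exists.

```shell
# Hypothetical helper: replace spaces with underscores in every file and
# directory name under a root directory (default: current directory).
rename_spaces() {
    # -depth lists a directory's contents before the directory itself,
    # so children are renamed before their parent's path changes.
    find "${1:-.}" -depth -name '* *' -exec sh -c '
        for path do
            dir=$(dirname "$path")
            base=$(basename "$path")
            new=$(printf "%s" "$base" | tr " " "_")
            # Do not clobber an existing file with the same target name.
            if [ -e "$dir/$new" ]; then
                echo "skipping $path: $dir/$new already exists" >&2
            else
                mv -- "$path" "$dir/$new"
            fi
        done
    ' sh {} +
}
```

Usage: rename_spaces /path/to/folder (try it on a copy first).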

HDD clean up; Does it spark joy?

Search in Windows Explorer for files larger than 128 MB.
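
On a Linux or macOS box, a rough shell equivalent of that search could look like the sketch below. big_files is a made-up helper name, and the M size suffix assumes GNU or BSD find (it is not in POSIX).

```shell
# Hypothetical helper: list regular files larger than a threshold in MiB
# (default 128) under a directory (default: current directory).
big_files() {
    root="${1:-.}"
    min_mib="${2:-128}"
    # -size +NM matches files bigger than N MiB (GNU/BSD find extension).
    find "$root" -type f -size +"${min_mib}M" -print
}
```

Usage: big_files ~/Downloads lists files over 128 MiB; big_files ~/Downloads 1024 raises the threshold to 1 GiB.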

Saturday, 29 September 2018

Koala Genome assembled on AWS

Excerpted from the AWS blog:
Five years ago, a research team led by Dr. Rebecca Johnson (Director of the Australian Museum Research Institute) set out to learn more about koala populations, genetics, and diseases. As a biologically unique animal with a limited appetite, maintaining a healthy and genetically diverse population are both key elements of any conservation plan. In addition to characterizing the genetic diversity of koala populations, the team wanted to strengthen Australia’s ability to lead large-scale genome sequencing projects.
Inside the Koala Genome
Last month the team published their results in Nature Genetics. Their paper (Adaptation and Conservation Insights from the Koala Genome) identifies the genomic basis for the koala’s unique biology. 

This work was performed on AWS. The research team used cfnCluster to create multiple clusters, each with 500 to 1000 vCPUs, and running Falcon from Pacific Biosciences. All in all, the team used 3 million EC2 core hours, most of which were EC2 Spot Instances.

Tuesday, 11 September 2018

BioBloom tools: fast, accurate and memory-efficient host species sequence screening using bloom filters

Bioinformatics, Volume 30, Issue 23, 1 December 2014, Pages 3402–3404
Published: 20 August 2014


Large datasets can be screened for sequences from a specific organism, quickly and with low memory requirements, by a data structure that supports time- and memory-efficient set membership queries. Bloom filters offer such queries but require that false positives be controlled. We present BioBloom Tools, a Bloom filter-based sequence-screening tool that is faster than BWA, Bowtie 2 (popular alignment algorithms) and FACS (a membership query algorithm). It delivers accuracies comparable with these tools, controls false positives and has low memory requirements.

Tuesday, 20 March 2018

JD: Sr. Software DevOps Engineer at Guardant Health
Gotta love this line 
“We wanted flying cars and instead we got 140 characters” is a much-repeated complaint about Silicon Valley. But with all due respect to flying cars, we believe that our mission is even more critical. 

Notable skills in the JD to pursue:
Ansible / Chef

This paragraph sounds exactly like what I face on a daily basis

Your troubleshooting skills are excellent, and you enjoy a good daily challenge in supporting rapid growth and a diverse set of end user needs. You have the ability to maintain day to day support while running various key projects that move the business forward by automating and creating new tools that facilitate management of the environment.

Friday, 23 February 2018

Exploring the 1000 genome dataset with Hail on Amazon EMR and Amazon Athena

Blog post from Roy Hasson

Genomics analysis has taken off in recent years as organizations continue to adopt the cloud for its elasticity, durability, and cost. With the AWS Cloud, customers have a number of performant options to choose from. These options include AWS Batch in conjunction with AWS Lambda and AWS Step Functions; AWS Glue, a serverless extract, transform, and load (ETL) service; and of course, the AWS big data and machine learning workhorse Amazon EMR.
For this task, we use Hail, an open source framework for exploring and analyzing genomic data that uses the Apache Spark framework. In this post, we use Amazon EMR to run Hail. We walk through the setup, configuration, and data processing. Finally, we generate an Apache Parquet–formatted variant dataset and explore it using Amazon Athena.

Datanami, Woe be me