Friday, 28 June 2013

Lessons learned from implementing a national infrastructure in Sweden for storage and analysis of next-generation sequencing data

Each time I change jobs, I have to go through the adventure (and sometimes pain) of relearning the computing resources available to me (personal), the lab (small shared pool), and the entire institute/company/school (usually not enough to go around).
Depending on the job scope, the number of cores, and the length of the job, I would then set things up to run on one of these three resources.
Sometimes grant money appears magically and my boss asks me what I need to buy (OK, TBH this is rare). Hence it's always nice to keep a lookout for what's available on the market and who's using what to do what, so that when that day comes I won't be stumped for an answer.

Excerpted from the provisional PDF are three points that I fully agree with:

Three GiB of RAM per core is not enough
You won't believe the number of things I tried in order to outsmart the 'system' just to squeeze out enough RAM for my jobs. Like looking for parallel queues, which often have a bigger RAM allocation. Or testing on small jobs to make sure everything ran OK before scaling up, only to have the full job fail after two days due to insufficient RAM.
MPI is not widely used in NGS analysis
A lot of the queues in the university's shared resource had ample capacity for my jobs but were reserved for MPI jobs, so I couldn't touch them at all.
A central file system helps keep redundancy to a minimum
Balancing RAM against compute cores to make the job splitting efficient was one thing. The other pain in the aXX was having to move files off the compute node as soon as the job was done and clear all the intermediate files. There were times when a job might have failed, but because I deleted the intermediate files in the last step of the pipeline bash script, I couldn't be sure it had run to completion. In the end I had to rerun the job and keep the intermediate files.
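One way around the deleted-intermediates problem is to clean up only after every step has reported success, and to leave a marker file behind as proof of completion. Below is a minimal sketch of that idea in Python rather than bash; the function name `run_pipeline`, the marker file name `PIPELINE_DONE`, and the way steps are passed in are all my own assumptions for illustration, not anything from the paper or from a real pipeline.

```python
import subprocess
import sys
from pathlib import Path


def run_pipeline(steps, workdir, intermediates):
    """Run each command in order; delete intermediates only if all succeed.

    steps         -- list of commands, each a list of arguments
    workdir       -- directory the commands run in
    intermediates -- file names (relative to workdir) that are safe to
                     delete once the whole pipeline has finished
    Returns True on success, False as soon as any step fails.
    """
    workdir = Path(workdir)
    marker = workdir / "PIPELINE_DONE"  # hypothetical marker file name
    marker.unlink(missing_ok=True)      # drop any stale marker from an old run

    for number, command in enumerate(steps, start=1):
        result = subprocess.run(command, cwd=workdir)
        if result.returncode != 0:
            # Stop here and keep every intermediate file for debugging.
            print(f"step {number} failed with exit code {result.returncode}; "
                  "intermediate files kept", file=sys.stderr)
            return False

    # Only reached when every step exited with status 0.
    for name in intermediates:
        (workdir / name).unlink(missing_ok=True)
    marker.write_text("ok\n")
    return True
```

If the marker file is missing afterwards, the run did not finish and the intermediate files are still there to inspect. In a bash script, putting `set -e` at the top gives a similar stop-on-first-failure behaviour, so the final cleanup step is never reached after an error.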

Anyway, for more info you can check out the paper below.

Lessons learned from implementing a national infrastructure in Sweden for storage and analysis of next-generation sequencing data

Samuel Lampa, Martin Dahlö, Pall I Olason, Jonas Hagberg and Ola Spjuth
GigaScience 2013, 2:9 doi:10.1186/2047-217X-2-9
Published: 25 June 2013

Abstract (provisional)

Analyzing and storing data and results from next-generation sequencing (NGS) experiments is a challenging task, hampered by ever-increasing data volumes and frequent updates of analysis methods and tools. Storage and computation have grown beyond the capacity of personal computers and there is a need for suitable e-infrastructures for processing. Here we describe UPPNEX, an implementation of such an infrastructure, tailored to the needs of data storage and analysis of NGS data in Sweden serving various labs and multiple instruments from the major sequencing technology platforms. UPPNEX comprises resources for high-performance computing, large-scale and high-availability storage, an extensive bioinformatics software suite, up-to-date reference genomes and annotations, a support function with system and application experts as well as a web portal and support ticket system. UPPNEX applications are numerous and diverse, and include whole genome-, de novo- and exome sequencing, targeted resequencing, SNP discovery, RNASeq, and methylation analysis. There are over 300 projects that utilize UPPNEX and include large undertakings such as the sequencing of the flycatcher and Norwegian spruce. We describe the strategic decisions made when investing in hardware, setting up maintenance and support, allocating resources, and illustrate major challenges such as managing data growth. We conclude with summarizing our experiences and observations with UPPNEX to date, providing insights into the successful and less successful decisions made.

The complete article is available as a provisional PDF. The fully formatted PDF and HTML versions are in production.

Thursday, 27 June 2013

Data Analyst - Asia Product Vigilance Procter & Gamble - SG-Singapore-Singapore (Singapore)

Haven't been looking at jobs for a while .. here's something that caught my eye .. typically I feel that pharma jobs are more detailed and pertinent in their hiring criteria since they have a very good idea of what kind of skills they want from the person they are hiring .. it's no different here .. so not much of a surprise. The biggest surprise was that they would consider a bachelor's degree holder only if they have 8 years of relevant experience. Whether that's equivalent, I shall leave you to decide ..

Job Description


This position will be part of the Global Safety Surveillance and Analysis organization. The successful job incumbent will query consumer complaints and adverse health effects databases for purposes of signal detection and ad hoc requests by internal customers (Product Vigilance-PV Managers, Product Development, Business Team, Central Product Safety, etc.), and develop customized reports to interpret data and identify potential safety issues as well as health concerns for in-market products.


- Querying large health effects datasets to identify safety signals and trends

- Generating periodic safety reports with different levels of specificities and focuses

- Summarizing and interpreting safety data together with PV managers

- Developing and applying statistical methods for various data analysis needs

- Continuously improving current data mining methodology, tools, and systems together with global work force.

- Masters Degree in applied statistics or data mining fields with relevant data analysis experience, or advanced degree in human health or clinically oriented informatics research. Bachelor Degree holders with at least 8 years of relevant experience can also be considered.

- Experience with electronic data capture systems and analytical tools, including manipulation and visualization of data using reporting tools. Statistical analysis and database operation experience is preferred.

- Excellent written and oral communication skills in English, with emphasis on communication of human health relevance, safety and clinical information

- Excellent organization skills, attention to details and ability to manage complex systems.

- Good collaboration skills: experience establishing and maintaining global and cross-functional working relationships.

Company Description

About Procter & Gamble

P&G serves approximately 4.6 billion people around the world with its brands. The Company has one of the strongest portfolios of trusted, quality, leadership brands, including Pampers®, Tide®, Ariel®, Always®, Whisper®, Pantene®, Mach3®, Bounty®, Dawn®, Fairy®, Gain®, Charmin®, Downy®, Lenor®, Iams®, Crest®, Oral-B®, Duracell®, Olay®, Head & Shoulders®, Wella®, Gillette®, Braun®, Fusion®, Ace®, Febreze®, Ambi Pur®, SK-II®, and Vicks®. The P&G community includes operations in approximately 75 countries worldwide. Please visit for the latest news and in-depth information about a career at P&G!

Additional Information

June 17, 2013
Entry level
Science, Research 
Consumer Goods 

Tuesday, 4 June 2013

Bloomberg: MRI for $7,332 Shows Wide Variety in U.S. Medical Costs

There was a Kaggle contest for predicting return visits from patient data, if I recall correctly. I wonder what the aim of this "datapalooza" was, but this side finding does raise questions about healthcare costing.
Alternatively it might simply mean that some numbers were wrongly entered into the database. I for one have always been lost in the maze of medical receipts, where some items are grouped together and some are separate.
Lesson learnt: it's important to know your primary data.

From Bloomberg, 4 Jun, 2013 4:45:54 AM

The costs of outpatient hospital care vary widely for typical services such as an MRI, according to data released by the U.S. government.


Bloomberg: Michael Douglas Oral Sex Cancer Claim Spurs Vaccine Calls

Wow, another celebrity "medical advice endorsement".

But it makes sense to immunise guys too, to stem the spread of the disease and to lower the cost of the vaccine for a prevalent disease, IMHO.

From Bloomberg, 3 Jun, 2013 8:51:50 PM

Michael Douglas's claim that oral sex led to his throat cancer is spurring calls to vaccinate more boys as well as girls against the human papilloma virus that causes the malignancy.


Datanami, Woe be me