Friday, 8 December 2017


Posted 15 months ago by cshevlin
Spotted this ad on Biostars:

The IBM Watson Health business division is now looking for talented individuals destined to usher in the next era of healthcare. We live in a moment of remarkable change and opportunity. The convergence of data and technology is transforming healthcare and life sciences organizations in every way. New roles are being created that never existed before to meet the demands of this transformation.
We are now looking for a Genomic Data Scientist to join our team.
You will have an opportunity to work directly with the team building new healthcare solutions using genomic analytics and serving oncologists, pathologists and other specialists caring for patients. You will help define, design, and build those solutions and apply your expertise to work in different analytical and statistical models.
Key Responsibilities:
  • Develop tools to transform, load, and validate data
  • Strategize new uses for data and its interaction with data design
  • Perform data studies of new and diverse data sources
  • Find new uses for existing data sources
  • Discover “stories” told by the data and present them to other scientists and business managers
  • Generate algorithms and create computer models
Ideal Candidates will possess the following:
Candidates should foremost have a strong background in data mining and statistics, along with a hands-on background in programming and in using databases and tools to mine data, including practical experience in extracting, transforming, and loading data as well as developing statistical and analytical models. Candidates must have a demonstrated capacity to adapt to demanding, high-pressure projects and to client’s needs.
  • Background in bioinformatics
  • Experience in Healthcare or Life Sciences
Learn more about IBM Watson Health and what we are doing …. And apply Now to explore this opportunity with us!
U.S. Department of Veterans Affairs Enlists IBM’s Watson in the War on Cancer: Public-Private Partnership Will Help Doctors Scale Precision Medicine Access for up to 10,000 VA Cancer Patients
IBM and New York Genome Center’s new cancer tumor repository aims to revolutionize treatment
IBM's Watson to help doctors devise optimal cancer treatment
Employment Type
Required Technical and Professional Expertise:
  • At least 2 years of experience in data mining
  • At least 1 year of experience with one or more data/statistics tools, such as Python, R, SPSS, Perl
  • At least 1 year of programming
  • Demonstrated ability in effective communication skills
  • Fluent in English
Preferred Technical and Professional Experience:
  • 3 years of experience with one or more data/statistics tools, such as Python, R, SPSS
  • 1 year of experience with relational databases, such as DB2, NoSQL, etc.

Sunday, 13 August 2017

Meet Nephele: Harness the Power of the Cloud for Your Microbiome Data Analysis

Nephele is a project from the National Institutes of Health (NIH) that brings together microbiome data and analysis tools in a cloud computing environment. It uses cloud-based resources to address a major challenge facing researchers today: analyzing, transferring, and storing biomedical "big data".

 Why Use Nephele?

  • Liberating: Nephele enables you to break free from constraints imposed on high-throughput computational analysis
  • Simple: Nephele is designed to be a no-hassle, easy-to-use tool to support your research
  • Sophisticated: Nephele is the most intuitive, advanced and secure microbiome analysis platform designed by our experienced computational biologists and software development team to provide exceptional capability with little effort on your part
  • Fast: Nephele speeds up microbiome data analysis and paves the path to getting to your results
  • Economical: Nephele's on-demand, pay-as-you-go setup offers a cost-effective alternative to using dedicated resources for your microbiome data analysis
Ready to get started? Visit and enter your email address. Check your inbox for a message with the subject "Your Nephele Promotional Codes."
Stay in touch! Email with your questions and feedback. You can also visit our Google+ community page to connect with other researchers in the microbiome community.


demo bam file Ion Torrent 314 chip of E. coli 400 bp run for download

BAM file of B22-730 (314v2 E. coli 400 bp run)
Ion Torrent PGM 314v2 run with a mode read length of 400bp and per-base raw read accuracy >99%.


Wednesday, 2 August 2017

Creating filtered fastq files of ONLY mapped reads from a BAM file

Filtering BAM files for mapped or unmapped reads

To get the unmapped reads from a BAM file, use:
samtools view -f 4 file.bam > unmapped.sam
The output will be in SAM format. To get the output in BAM format, use:
samtools view -b -f 4 file.bam > unmapped.bam
To get only the mapped reads, use the -F parameter, which works like grep's -v and skips alignments that have the given flag set:
samtools view -b -F 4 file.bam > mapped.bam

Source: Sukhdeep Singh
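
To make the -f / -F logic above concrete, here is a minimal Python sketch that mimics the flag test on plain SAM text. It is only an illustration, not how samtools is implemented: the function names (flag_of, view) and the sample records are made up for this example. The one fact it relies on is that bit 4 of the SAM FLAG column means "read unmapped".

```python
UNMAPPED = 0x4  # SAM FLAG bit meaning "this read is unmapped"


def flag_of(sam_line):
    """Return the integer FLAG (second tab-separated column) of a SAM record."""
    return int(sam_line.split("\t")[1])


def view(sam_lines, require=0, exclude=0):
    """Mimic `samtools view -f require -F exclude` on SAM text lines.

    A record is kept when every bit in `require` is set in its FLAG and
    no bit in `exclude` is set. Header lines (starting with '@') are
    dropped, as in headerless `samtools view` output.
    """
    kept = []
    for line in sam_lines:
        if line.startswith("@"):
            continue
        flag = flag_of(line)
        if (flag & require) == require and (flag & exclude) == 0:
            kept.append(line)
    return kept


# Made-up records for illustration: r2 and r4 carry the unmapped bit.
records = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII",
    "r3\t16\tchr1\t50\t60\t4M\t*\t0\t0\tGGCC\tIIII",
    "r4\t77\t*\t0\t0\t*\t*\t0\t0\tAAAA\tIIII",
]

unmapped = view(records, require=UNMAPPED)  # like: samtools view -f 4
mapped = view(records, exclude=UNMAPPED)    # like: samtools view -F 4

print([r.split("\t")[0] for r in unmapped])
print([r.split("\t")[0] for r in mapped])
```

Because the test is bitwise, a flag such as 77 (paired, mate unmapped, first in pair, and unmapped) is still caught by -f 4 even though the number itself is not 4.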

To do this as efficiently as possible, using BBTools:
reformat.sh in=reads.sam out=mapped.fq mappedonly
BBMap also has many options designed for filtering, so it can output FASTQ directly and separate mapped from unmapped reads without creating intermediate SAM files. This approach also keeps pairs together, which is not easy to do when filtering with samtools:
bbmap.sh ref=reference.fa in=reads.fq outm=mapped.fq outu=unmapped.fq
Source: Brian Bushnell
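
For readers curious what a "mapped reads only" SAM to FASTQ conversion involves, here is a rough Python sketch. It is not the BBTools implementation, and the function name mapped_sam_to_fastq and the sample records are hypothetical. It assumes the standard SAM conventions: flag bit 0x4 marks unmapped reads, 0x10 marks reads aligned to the reverse strand (whose stored sequence is reverse-complemented), and 0x100 / 0x800 mark secondary and supplementary alignments.

```python
UNMAPPED, REVERSE, SECONDARY, SUPPLEMENTARY = 0x4, 0x10, 0x100, 0x800
READ1, READ2 = 0x40, 0x80
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def mapped_sam_to_fastq(sam_lines):
    """Yield one FASTQ record (a 4-line string) per mapped primary alignment."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        # Drop unmapped reads, and extra alignments that would duplicate a read.
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        # SAM stores reverse-strand reads reverse-complemented; undo that so
        # the FASTQ holds the read as it came off the sequencer.
        if flag & REVERSE:
            seq = seq.translate(_COMPLEMENT)[::-1]
            qual = qual[::-1]
        if flag & READ1:
            name += "/1"
        elif flag & READ2:
            name += "/2"
        yield f"@{name}\n{seq}\n+\n{qual}\n"


# Made-up records: r1 forward-mapped, r2 unmapped, r3 reverse-mapped.
records = [
    "r1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIJKL",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII",
    "r3\t16\tchr1\t50\t60\t4M\t*\t0\t0\tGGCA\tABCD",
]

fastq = "".join(mapped_sam_to_fastq(records))
print(fastq, end="")
```

Note that this filters read by read, so with paired data it can leave a mapped read without its unmapped mate. That is the pairing problem mentioned above, and a reason to prefer a tool that handles pairs for you.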

Wednesday, 12 April 2017

Control a fleet of embedded Unix systems (e.g. Raspberry Pi, Orange Pi) using SaltStack

HAHAHA I share the same name as a software project. Bizarre discovery today

Tuesday, 11 April 2017

GitHub-based, community-maintained list of cancer clinical informatics resources

Sean Davis created a GitHub-based, community-maintained list of cancer clinical informatics resources.
"Contributions are welcome!"

For now, it's named as

Datanami, Woe be me