Kevin's GATTACA World: 2017

Friday, 8 December 2017

Job: DATA SCIENTIST--WATSON HEALTH-CAMBRIDGE, MA.

Spotted this ad on Biostars: https://www.biostars.org/p/210796/

The IBM Watson Health business division is now looking for talented individuals destined to usher in the next era of healthcare. We live in a moment of remarkable change and opportunity. The convergence of data and technology is transforming healthcare and life sciences organizations in every way. New roles are being created that never existed before to meet the demands of this transformation.

Link: http://ibm.biz/BdruNd

We are now looking for a Genomic Data Scientist to join our team.

You will have an opportunity to work directly with the team building new healthcare solutions using genomic analytics and serving oncologists, pathologists and other specialists caring for patients. You will help define, design, and build those solutions and apply your expertise to work in different analytical and statistical models.

Key Responsibilities:
Develop tools to transform, load, and validate data
Strategize new uses for data and its interaction with data design
Perform data studies of new and diverse data sources
Find new uses for existing data sources
Discover “stories” told by the data and present them to other scientists and business managers
Generate algorithms and create computer models

Ideal Candidates will possess the following:

Candidates should foremost have a strong background in data mining and statistics.
Hands-on background in programming and using databases and tools to mine data, including practical experience in extracting, transforming, and loading data as well as developing statistical and analytical models.
Candidates must have demonstrated capacity to adapt to demanding, high-pressure projects and to clients’ needs.
Background in bioinformatics
Experience in Healthcare or Life Sciences

Learn more about IBM Watson Health and what we are doing, and apply now to explore this opportunity with us!

U.S. Department of Veterans Affairs Enlists IBM’s Watson in the War on Cancer: Public-Private Partnership Will Help Doctors Scale Precision Medicine Access for up to 10,000 VA Cancer Patients
http://www-03.ibm.com/press/us/en/pressrelease/50061.wss

IBM and New York Genome Center’s new cancer tumor repository aims to revolutionize treatment

IBM's Watson to help doctors devise optimal cancer treatment

Employment Type

Full-Time

Required Technical and Professional Expertise
At least 2 years of experience in data mining
At least 1 year of experience with one or more data/statistics tools, such as Python, R, SPSS, Perl
At least 1 year of programming
Demonstrated ability in effective communication skills

Fluent in English

Preferred Technical and Professional Experience
3 years of experience with one or more data/statistics tools such as Python, R, SPSS
1 year of experience with relational databases, such as DB2, NoSQL, etc

Sunday, 13 August 2017

Meet Nephele: Harness the Power of the Cloud for Your Microbiome Data Analysis

Nephele is a project from the National Institutes of Health (NIH) that brings together microbiome data and analysis tools in a cloud computing environment. It aims to address a major challenge facing researchers today, namely analyzing, transferring, and storing biomedical "big data", through the use of cloud-based resources.

Why Use Nephele?

Liberating: Nephele enables you to break free from constraints imposed on high-throughput computational analysis
Simple: Nephele is designed to be a no-hassle, easy-to-use tool to support your research
Sophisticated: Nephele is the most intuitive, advanced and secure microbiome analysis platform designed by our experienced computational biologists and software development team to provide exceptional capability with little effort on your part
Fast: Nephele speeds up microbiome data analysis and paves the path to getting to your results
Economical: Nephele's on-demand, pay-as-you-go setup offers a cost-effective alternative to using dedicated resources for your microbiome data analysis

Ready to get started? Visit https://nephele.niaid.nih.gov/ and enter your email address. Check your inbox for a message with the subject "Your Nephele Promotional Codes."

Stay in touch! Email nephele@mail.nih.gov with your questions and feedback. You can also visit our Google+ community page to connect with other researchers in the microbiome community (https://plus.google.com/communities/107278901311674483366).

Source: https://www.biostars.org/p/204081/

Demo BAM file for download: Ion Torrent 314 chip, E. coli 400 bp run

BAM file of B22-730 (314v2 E. coli 400 bp run)
Ion Torrent PGM 314v2 run with a mode read length of 400bp and per-base raw read accuracy >99%.

https://s3.amazonaws.com/ion-torrent/pgm/B22-730/B22-730.bam

Source: https://apps.thermofisher.com/apps/publiclib/#/datasets

Wednesday, 2 August 2017

Creating filtered fastq files of ONLY mapped reads from a BAM file

Filtering BAM files for mapped or unmapped reads

To get the unmapped reads from a BAM file, use:

samtools view -f 4 file.bam > unmapped.sam

The output will be in SAM format. To get the output in BAM format instead, add -b:

samtools view -b -f 4 file.bam > unmapped.bam

To get only the mapped reads, use the -F parameter, which works like grep's -v and skips alignments that have the given flag set.

samtools view -b -F 4 file.bam > mapped.bam
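The flag logic behind -f and -F can be sketched in a few lines of Python. This is only an illustration of the bitwise test, not how samtools is implemented: bit 0x4 is the "read unmapped" flag from the SAM specification, while the function names and the sample records below are made up for the example.

```python
# SAM FLAG bit 0x4 means "this read is unmapped" (SAM specification).
UNMAPPED = 0x4


def keep_with_f(flag: int, required: int) -> bool:
    """Mimic `samtools view -f`: keep a record only if ALL bits in `required` are set."""
    return (flag & required) == required


def keep_with_F(flag: int, excluded: int) -> bool:
    """Mimic `samtools view -F`: keep a record only if NONE of the bits in `excluded` are set."""
    return (flag & excluded) == 0


# Hypothetical (read name, FLAG) pairs, for illustration only.
records = [("r1", 0), ("r2", 4), ("r3", 16), ("r4", 77), ("r5", 99)]

unmapped = [name for name, flag in records if keep_with_f(flag, UNMAPPED)]
mapped = [name for name, flag in records if keep_with_F(flag, UNMAPPED)]

print("unmapped:", unmapped)
print("mapped:", mapped)
```

With a single bit such as 4, the two options are exact complements, so every record lands in one output or the other.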

Source: https://www.biostars.org/p/56246/ Sukhdeep Singh

To do this as efficiently as possible, use BBTools:

reformat.sh in=reads.sam out=mapped.fq mappedonly

Also, BBMap has a lot of options designed for filtering, so it can output in fastq format and separate mapped from unmapped reads, preventing the creation of intermediate sam files. This approach also keeps pairs together, which is not very easy using samtools for filtering.

bbmap.sh ref=reference.fa in=reads.fq outm=mapped.fq outu=unmapped.fq

Source: https://www.biostars.org/p/127992/ Brian Bushnell
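To see why keeping pairs together takes extra care when filtering on flags, here is a small Python sketch (my own, not from either answer above). It assumes paired-end data and uses two bits from the SAM specification: 0x4 (read unmapped) and 0x8 (mate unmapped). The function name and the example flags are hypothetical.

```python
# Flag bits from the SAM specification.
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def pair_status(flag: int) -> str:
    """Classify a paired-end record by the mapping state of the read and its mate."""
    read_unmapped = bool(flag & READ_UNMAPPED)
    mate_unmapped = bool(flag & MATE_UNMAPPED)
    if read_unmapped and mate_unmapped:
        return "both_unmapped"  # what `-f 12` would select
    if not read_unmapped and not mate_unmapped:
        return "both_mapped"    # what `-F 12` would select
    return "split"              # one mate mapped, the other not


# Hypothetical FLAG values, for illustration only.
for flag in (99, 147, 77, 141, 73, 133):
    print(flag, pair_status(flag))
```

Filtering on bit 4 alone sends the two mates of a "split" pair to different outputs, which is the situation the BBMap approach avoids.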

Wednesday, 12 April 2017

Control a fleet of embedded unix systems (eg Raspberry Pi, Orange Pi) using saltstack

HAHAHA I share the same name as a software project. Bizarre discovery today

https://github.com/unixbigot/kevin
Control a fleet of embedded unix systems (eg Raspberry Pi, Orange Pi) using saltstack

Tuesday, 11 April 2017

github-based, community-maintained list of cancer clinical informatics resources

Sean Davis created a github-based, community-maintained list of cancer clinical informatics resources.
"Contributions are welcome!" https://lnkd.in/d-uphUc

For now, it's named ci4cc-informatics-resources:

https://github.com/seandavi/ci4cc-informatics-resources

Tuesday, 7 February 2017

offline plotly Gantt plots using Python/pandas

Modified from https://plot.ly/python/gantt/#use-a-pandas-dataframe to run offline and outside of IPython.
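The modified script itself is not reproduced here, so the following is only a minimal sketch of the general idea, assuming pandas and plotly are installed. The task data, the function names, and the output filename are hypothetical; create_gantt and plotly.offline.plot are the plotly calls I would expect such a script to use, with auto_open=False so that no browser or notebook is needed.

```python
from datetime import date

# Hypothetical tasks; the Task/Start/Finish keys follow plotly's Gantt convention.
TASKS = [
    {"Task": "Job A", "Start": "2017-01-01", "Finish": "2017-01-15"},
    {"Task": "Job B", "Start": "2017-01-10", "Finish": "2017-02-01"},
]


def task_durations(tasks):
    """Return days per task, a quick sanity check on the dates before plotting."""
    return {
        t["Task"]: (date.fromisoformat(t["Finish"]) - date.fromisoformat(t["Start"])).days
        for t in tasks
    }


def write_gantt_html(tasks, filename="gantt.html"):
    """Write a standalone HTML Gantt chart without IPython or a plotly account."""
    # Imported here so the data checks above run even without pandas/plotly installed.
    import pandas as pd
    import plotly.figure_factory as ff
    from plotly.offline import plot

    df = pd.DataFrame(tasks)    # same shape as the DataFrame in the plotly example
    fig = ff.create_gantt(df)   # build the Gantt figure from Task/Start/Finish
    return plot(fig, filename=filename, auto_open=False)  # writes the HTML file


print(task_durations(TASKS))
```

Calling write_gantt_html(TASKS) from a plain Python script should then write gantt.html to the working directory, which can be opened in any browser.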
