Computational Biology Consultant @ Self Employed
Developed probabilistic models of cell-free DNA in the context of non-invasive prenatal screening.
From February 2019 to August 2019 (7 months)

Computational Biologist @ Freenome
• Developed a CNV caller based on negative binomial regression, a hidden Markov model, and expectation maximization.
• Developed machine learning algorithms: SVD-based batch normalization and deep learning neural networks (CNNs, autoencoders, variational autoencoders, transfer learning).
• Used recent advances in machine learning libraries to create a novel form of negative binomial regression that corrects for batch effects, with a specialized loss function to account for GC bias correction.
• Implemented probabilistic graphical models for quantifying sample-to-sample contamination and for finding duplicate samples from low-coverage data.
• Owned, enhanced, and maintained the NGS pipeline, which processed petabyte-scale data on Google Cloud. This included a complete rewrite, daily monitoring of jobs, and new features such as UMI deduplication, QC metrics, adapter trimming, logging, a job database, DevOps, etc.
• Led multiple projects around quality control.
• Mentored teammates on contributing to the NGS pipeline software.
• Ad hoc data science and data analysis.
From October 2017 to October 2018 (1 year 1 month)
San Francisco Bay Area

Scientific Advisor @ NGX Bio
Assisted in the deployment and development of an open source software suite, which I wrote, for whole genome and exome analysis on AWS. It uses spot instance clusters to run the GATK best practices on large NGS datasets. Provided short- and long-term scientific and software architecture advice.
From October 2015 to May 2016 (8 months)
San Francisco

Senior Research Associate, Lab for Personalized Medicine @ Harvard Medical School
Bioinformatics Programmer in the Lab for Personalized Medicine at the Center for Biomedical Informatics. Worked with various research groups and clinicians, developed several software applications for NGS analysis and cloud computing, taught practicum classes on NGS analysis, and developed an NGS pipeline that performed GATK best practices on whole genome data on AWS using spot instances. Among other things, developed a variant annotation pipeline, a basic LIMS, a SQL variant database, and a genetics research annotation database.
From September 2010 to May 2013 (2 years 9 months)

Senior Research Associate @ Beth Israel Deaconess Medical Center
Worked with clinicians to identify genomic classifications of responders/non-responders in a Phase III clinical trial. Worked with one of the first academic groups to do clinical exome sequencing and interpretation, providing bioinformatics pipeline and engineering support.
From 2012 to 2013 (1 year)
Greater Boston Area

Senior Bioinformatician @ Invitae
• Developed and productionized a patented bioinformatic algorithm and assay for detecting pathogenic mutations in PMS2 and other genes which contain a highly paralogous pseudogene. This gave Invitae’s hereditary cancer offering an important edge over the competition.
• Made many improvements to the sensitivity/specificity of Invitae’s clinical variant calling pipeline. Authored and clinically validated several variant calling pipelines. Authored Cosmos, the open source Python library that Invitae pipelines are written in.
• Designed and developed a patent-pending low-cost NGS cytogenetics assay capable of detecting CNVs >250kb and AOH >3MB across the entire genome. This included design of hybridization capture probe sets, proof of concept, experimental design, and enhancements to the internal CNV calling algorithm, probability models, and software.
• Developed and productionized a Bayesian maximum-likelihood-based contamination estimator for NGS data, which has been vital for QC.
• Helped develop a Bayesian pedigree checker: given a multi-sample VCF, it determines all pairwise relationships (parent/child, sibling, cousin, etc.) and their associated probabilities.
• Helped with proof of concept work, and wrote and productionized a variant calling software pipeline for an MLPA- and NGS-based copy number detection and confirmation assay.
• Developed and productionized an indel realignment algorithm (similar to GATK’s).
• Developed and productionized a reference confidence metric for auto-closing gaps.
• Developed a CGH vs NGS fingerprinting tool using Bayesian probabilistic modeling, which had high sensitivity/specificity even with very limited data.
• Created an alignment visualization REST service used for visual clinical triaging of data.
• Created a software pipeline for spelling-agnostic VCF comparison, used for routine pipeline and assay validation.
• Provided bioinformatics support for CLIA validation of our primary assay.
From June 2013 to October 2017 (4 years 5 months)
San Francisco

Co-Founder, Machine Learning Scientist @ Ravel Biotechnology
Developing cell-free DNA based technologies for the early detection of disease.
San Francisco Bay Area
What company does Erik Gafni work for?
Erik Gafni is self-employed.
What is Erik Gafni's role at Self Employed?
Erik Gafni is a Computational Biology Consultant.
What industry does Erik Gafni work in?
Erik Gafni works in the Biotechnology industry.