From data to insights and action: Strategies to take your bioinformatics to the next level
Eleanor Howe, Diamond Age Data Science
Huseyin Mehmet, Zafgen, Inc.
December 7, 2018
What is this talk about?
• Who are we? What is computational biology?
• Lessons learned from working with our customers
• Our ongoing relationship with Zafgen
• Q&A
Eleanor Howe, PhD
Background in molecular biology, statistics,
programming and computational
biology/bioinformatics
eleanor@diamondage.com
Diamond Age Data Science
www.diamondage.com
Bioinformatics/computational biology consulting
Project-based analysis
Staff augmentation
Pipeline development
“Drop-in” bioinformatics department
The Diamond Age: or,
A Young Lady’s Illustrated Primer
by Neal Stephenson
Team
Chris Friedline
Sequencing,
software engineering
Somdutta Saha
Computational chemistry and
proteomics
Bruce Romano
Mathematics and data science
Nicholas Crawford
Human genetics and GWAS
Mike DeRan
Cancer and diabetes
therapeutics, scRNA-seq
Max Marin
RNA splicing
Zarko Boskovic
Medicinal chemistry and
metabolomics
Chris Dwan
IT and data security
A few of our clients
Computational Biology
Computational biology is data
science for biology
Bioinformatics is sometimes a
synonym for computational
biology.
Other times, bioinformatics refers
to software engineering for
biology.
Lessons learned
Drug discovery requires evaluation of
diverse, complex data
• Sequence analysis is very different
from proteomics
• Knowing the landscape of available
datasets is key
• Individual bioinformaticians tend to
specialize in one sub-field or
another
Public datasets are a gold mine
• Cancer Cell Line Encyclopedia (CCLE)
• The Cancer Genome Atlas (TCGA)
• Gene Expression Omnibus (GEO)
• Cancer Dependency Map (DepMap)
• UK Biobank
• DrugBank
• VarSome
• GTEx
But the real gems come from your own
experiments
It’s not possible to validate a drug
target using public datasets alone.
The public datasets are general, and
cover only the most common
diseases or disease subtypes.
The most useful results come from
combining custom-generated data
with public data.
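As a minimal sketch of what "combining" can mean at the most basic level, the snippet below joins an in-house table with a public one on shared gene identifiers and reports what fails to match. The gene names and values are invented for illustration, and the function name `join_on_gene` is hypothetical; a real analysis would also need normalization and batch-effect handling, which this sketch does not attempt.

```python
from typing import Dict, List, Tuple


def join_on_gene(
    in_house: Dict[str, float],
    public: Dict[str, float],
) -> Tuple[List[Tuple[str, float, float]], List[str], List[str]]:
    """Inner-join two gene -> value tables and list the unmatched genes."""
    shared = sorted(in_house.keys() & public.keys())
    joined = [(gene, in_house[gene], public[gene]) for gene in shared]
    only_in_house = sorted(in_house.keys() - public.keys())
    only_public = sorted(public.keys() - in_house.keys())
    return joined, only_in_house, only_public


# Toy values, invented purely for illustration.
in_house = {"GENE_A": 5.2, "GENE_B": 0.4, "GENE_C": 9.1}
public = {"GENE_A": 4.8, "GENE_C": 2.3, "GENE_D": 7.7}

joined, only_in_house, only_public = join_on_gene(in_house, public)
for gene, ours, theirs in joined:
    print(f"{gene}\tin-house={ours}\tpublic={theirs}")
print("missing from public data:", only_in_house)
print("missing from in-house data:", only_public)
```

Reporting the unmatched identifiers matters as much as the join itself: mismatched gene naming between sources is a common reason a combined table ends up smaller than expected.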
CROs do the basics well
• Ocean Ridge, Novogene ($200 transcriptome!)
• Good for the basics - RNA-seq, DNA-seq, proteomics, metabolomics
• Reasonable standardized analysis pipelines
• Challenges:
  • combining multiple datasets across experiments or across CROs
  • more involved analysis (e.g. splicing)
• Do a thorough cost comparison when considering an academic collaborator
• Also ask them when their student is graduating.
What additional expertise do you need?
Early-stage “traditional” therapeutics companies don’t need a full-time
computational biologist. A part-time one can work fine.
When the company expands, hire a computational biologist with
substantial experience, or an analyst with some kind of advisor
available.
Know what you need
Computational biologist: Experience/training in all three areas
(biology, programming, statistics)
Analyst: Biology + programming, with an advisor to help with the
statistics
Methods developer: Wants to build new analytical tools
What expertise do you need?
For Teams:
• Cross-discipline expertise
  - biology, chemistry, computer science, statistics
• Communication skills
• Lateral thinking
Expertise gets you fast answers
The problem:
Get a terabyte of data from a USB
hard drive to the cloud in time to
analyze a dataset for a conference
The solution:
Bicycle across the Charles
3 Gb/s bicycle (latency of 1.2M ms)
Datacenter internet connection
Markley Data Center
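The arithmetic behind the bicycle joke can be sketched as follows. The slide's latency of 1.2M ms is 1200 seconds, or a 20-minute ride. Everything else here is an assumption made for illustration: a decimal terabyte (10^12 bytes) and a hypothetical 100 Mb/s office uplink for comparison. With those inputs a full terabyte works out to about 6.7 Gb/s, so the slide's 3 Gb/s figure presumably rests on inputs the slide does not state.

```python
def effective_throughput_gbps(size_bytes: float, transit_seconds: float) -> float:
    """Average bandwidth, in gigabits per second, of data carried physically."""
    return size_bytes * 8 / transit_seconds / 1e9


def network_transfer_seconds(size_bytes: float, link_gbps: float) -> float:
    """Time to push the same payload over a network link of the given speed."""
    return size_bytes * 8 / (link_gbps * 1e9)


TERABYTE = 1e12               # assumption: decimal terabyte
RIDE_SECONDS = 1.2e6 / 1000   # the slide's 1.2M ms latency, i.e. 1200 s
OFFICE_LINK_GBPS = 0.1        # hypothetical 100 Mb/s office uplink

bike = effective_throughput_gbps(TERABYTE, RIDE_SECONDS)
upload_hours = network_transfer_seconds(TERABYTE, OFFICE_LINK_GBPS) / 3600
print(f"bicycle: {bike:.1f} Gb/s, office upload: {upload_hours:.1f} h")
```

The point of the comparison is the trade-off: physical transport has enormous latency but very high bandwidth, which wins when the payload is large and the deadline is hours away.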
Deep Learning / Artificial Intelligence
Another danger zone
Deep Learning / Artificial Intelligence
Deep learning is “new” in
that it’s a more complex
version of older
technology: a neural
network
Modern compute power
allows for powerful
classifiers trained on very
large datasets
The basics of machine learning (and DL)
Deep Learning works in a
similar way to other types
of machine learning.
The algorithms use larger
datasets and are more
complex. But the overall
workflow is the same.
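The shared workflow (hold out some data, fit on the rest, score on the held-out part) can be sketched with a deliberately simple model. The nearest-centroid classifier below is not from the talk; it is chosen only because it fits in a few lines of standard-library Python. The data are synthetic, and the labels "control" and "treated" are placeholders.

```python
import random
from typing import Dict, List, Tuple

Sample = Tuple[List[float], str]  # (features, label)


def make_toy_data(n_per_class: int, seed: int) -> List[Sample]:
    """Two well-separated synthetic clusters, invented for illustration."""
    rng = random.Random(seed)
    data: List[Sample] = []
    for label, centre in (("control", 0.0), ("treated", 10.0)):
        for _ in range(n_per_class):
            features = [centre + rng.uniform(-1, 1), centre + rng.uniform(-1, 1)]
            data.append((features, label))
    return data


def train_test_split(samples: List[Sample], test_fraction: float, seed: int):
    """Step 1: shuffle and hold out a test set the model never sees."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train: List[Sample]) -> Dict[str, List[float]]:
    """Step 2: 'training' here is just averaging the features per label."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for features, label in train:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids: Dict[str, List[float]], features: List[float]) -> str:
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid: List[float]) -> float:
        return sum((c - x) ** 2 for c, x in zip(centroid, features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids: Dict[str, List[float]], test: List[Sample]) -> float:
    """Step 3: evaluate on the held-out samples only."""
    hits = sum(predict(centroids, features) == label for features, label in test)
    return hits / len(test)


samples = make_toy_data(n_per_class=20, seed=0)
train, test = train_test_split(samples, test_fraction=0.25, seed=1)
model = fit_centroids(train)
print(f"held-out accuracy: {accuracy(model, test):.2f}")
```

Swapping in a deep network would change only the fitting and prediction steps; the split and the held-out evaluation stay the same.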
Should you use deep learning?
Is your training data:
• Large? 100,000+ to 1M+ samples
• Well-annotated? Gene expression data usually isn’t.
• Representative of the questions you want to answer?
In discovery biology, the data is usually not there. Hence “discovery”.
Good use-cases for deep learning
Image processing
Diagnostics from histology,
radiology
High-content screening
Biochemical structure/sequence
Epitope prediction
Protein folding (DeepMind)
Single-cell RNA-seq (potentially)
Should you use deep learning? (cont)
Do you need an interpretable model?
Deep learning is a black box
Have you tried everything else?
Linear models, random
forests, other ML techniques
These tools are often faster, cheaper,
and easier to understand and
implement
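The questions from the two "Should you use deep learning?" slides can be written down as a simple checklist. This is only a sketch of the slides' reasoning, not a tool from the talk: the `Project` fields and the function name are hypothetical, and the 100,000-sample threshold is the lower bound quoted on the slide.

```python
from dataclasses import dataclass
from typing import List

MIN_SAMPLES = 100_000  # lower bound quoted on the slide


@dataclass
class Project:
    n_samples: int
    well_annotated: bool
    representative: bool
    needs_interpretable_model: bool
    tried_simpler_models: bool


def deep_learning_concerns(project: Project) -> List[str]:
    """Return the reasons, taken from the slides, to hesitate before using deep learning."""
    concerns: List[str] = []
    if project.n_samples < MIN_SAMPLES:
        concerns.append(f"training set has fewer than {MIN_SAMPLES:,} samples")
    if not project.well_annotated:
        concerns.append("training data is not well annotated")
    if not project.representative:
        concerns.append("training data does not represent the question being asked")
    if project.needs_interpretable_model:
        concerns.append("an interpretable model is needed; deep learning is a black box")
    if not project.tried_simpler_models:
        concerns.append("linear models, random forests, or other ML have not been tried")
    return concerns


# A hypothetical discovery-stage expression study.
study = Project(
    n_samples=500,
    well_annotated=False,
    representative=True,
    needs_interpretable_model=True,
    tried_simpler_models=False,
)
for concern in deep_learning_concerns(study):
    print("-", concern)
```

An empty list does not mean deep learning is the right choice, only that none of the slides' warning signs apply.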
Huseyin Mehmet, PhD
Vice President and Head of Discovery Research
Zafgen, Inc.
Zafgen, Inc.
• Publicly traded bio-pharmaceutical company
• Founded 12 years ago, as of December 2018 (IPO in 2014)
• Virtual company
• Bringing MetAP2 inhibitors to market
• Areas of interest: Metabolic disease
Zafgen and Diamond Age
Diamond Age acts as a virtual bioinformatics
department for Zafgen
• Data Analysis
• Data Management
• Hypothesis generation
• Technology recommendations
What Diamond Age has done for Zafgen
• Transcriptional profiling
• Proteomics/phosphoproteomics
• Metabolomics
• Clinical outcomes
• Custom apps for client needs
The benefits
What can Zafgen do now that it couldn’t before?
• Iterative data generation
• Cross-dataset analyses
• Confidence in analysis results from CROs
• Link between pre-clinical and clinical data
• Cost efficiencies / value for money
Thank you!
Questions?
Using Bioinformatics Data to inform Therapeutics discovery and development
