Data Science in Drug
Discovery
Marina Sirota, PhD
Assistant Professor
Institute for Computational Health
Sciences
October 22, 2015
Data Driven Research
Integrative Personal “Omics”
Profiling
Genome
Transcriptome
Epigenome
Microbiome
Proteome / Metabolome
Antibodyome
Moore’s Law – Biology and
Computation
Cost Per Genome
Cost of Computational Resources
Can we use data integration
to…
Biomarker
Discovery
… to find
better
diagnostic
markers?
Disease
Mechanism
… understand
disease
better?
Therapeutics
… find new
uses for
existing
drugs?
Motivation
• Problem:
– Takes roughly 15 years and over
$800 million to develop and bring
a novel drug to market
– 90% of drugs fail in early
development
• Solution: Drug Repurposing
– Lower cost
– Reduce risk of failure
Problem Statement
Can we use public data to
systematically predict relationships
between drugs and diseases?
Sirota M, Dudley JT, Kim J, Chiang AP, Morgan AA, Sweet-Cordero A, Sage J, Butte AJ. Discovery and Validation of
Drug Indications Using Compendia of Public Gene Expression Data. Science Translational Medicine. Aug 2011.
Problem Statement
Can we use public data to
systematically predict relationships
between drugs and diseases?
Diseases
Drugs
What is Gene Expression
Profiling?
• Global snapshot of cellular function and
activity
– Genome sequence – what might be going on
– Expression – what is actually going on
• 25,000 genes → 1,000,000 proteins
• We can measure a few thousand proteins,
but gene expression is a global proxy
How Can We Measure Expression?
Microarrays
• Thousands of probes are hybridized to a solid
surface
• Takes advantage of complementary DNA
sequences
• Process:
– RNA is extracted from the sample
– Fluorescent labeling
– Hybridization and wash
– Scanning and signal processing
– Normalization and analysis!
Data Sources
• Connectivity Map: collection of expression
data from cultured human cells
• 453 experiments of 164 drugs
• Covers broad range of effects
– FDA approved drugs
– Non-drug bioactive small molecules
• NCBI GEO: publicly available gene expression
repository
– Platforms – 11,745
– Samples – 961,202
– Series – 39,679
• There are numerous experiments dealing
with over 200 diseases
Barrett et al. NCBI GEO: archive for high-throughput functional genomic data. Nucleic Acids Res. 2009.
Lamb et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease.
Science. 2006.
Disease Gene Expression Data
(GEO)
Butte AJ, Chen R. AMIA,
2006.
Download all GDS
Experiments GEO
Identify Disease
Associated Experiments
Identify Normal vs.
Disease Experiments
176 datasets, 3113
arrays, 100 diseases
Dudley J, Butte AJ. PSB,
2008.
Dudley JT, Tibshirani R, Deshpande T, Butte AJ. Disease signatures are robust across tissues and experiments. Mol
Disease Gene Expression
Signature
Disease
Individuals
Healthy
Controls
Disease Gene Expression
Signature
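A disease signature of this kind can be sketched as a per-gene comparison of disease samples against healthy controls. The snippet below is a minimal illustration only: it assumes positive, linear-scale intensities, uses a plain log2 fold-change cutoff in place of a proper statistical test, and the function name, threshold, and gene labels are all hypothetical.

```python
import math
from statistics import mean


def disease_signature(disease, control, min_abs_lfc=1.0):
    """Return {gene: log2 fold change} for genes that differ between groups.

    `disease` and `control` map gene names to lists of positive expression
    values (one value per sample). The cutoff is an illustrative assumption;
    a real analysis would add a significance test and multiple-testing
    correction.
    """
    signature = {}
    for gene, disease_values in disease.items():
        if gene not in control:
            continue  # only genes measured in both groups can be compared
        lfc = math.log2(mean(disease_values) / mean(control[gene]))
        if abs(lfc) >= min_abs_lfc:
            signature[gene] = lfc
    return signature


# Toy data with placeholder gene names (not real measurements).
disease_samples = {"GENE_A": [8.0, 8.0], "GENE_B": [1.0, 1.0], "GENE_C": [4.0, 4.0]}
control_samples = {"GENE_A": [2.0, 2.0], "GENE_B": [4.0, 4.0], "GENE_C": [4.0, 4.0]}

sig = disease_signature(disease_samples, control_samples)
print(sig)
```

Positive values mark genes up-regulated in disease and negative values mark down-regulated genes; the same comparison applied to treated versus untreated samples would give a drug profile.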
Drug Gene Expression Profile
Treated
Sample
Untreated
Sample
Drug Gene Expression Profile
Up-regulated Down-regulated
Hypothesis
[Figure: gene expression profiles (rows are genes) of a disease shown beside Drugs A, B, and C; each pairing is labeled Treatment, Adverse Reaction, or ?]
Lamb et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease.
Computational Pipeline
Disease Gene Expression Signature
Genes
Drugs
Disease-Drug Scores
Drugs Similar
to Disease
Drugs Opposite
to Disease
Drug-Disease Relationships
Drugs
Diseases
Positive Correlation – Adverse Reaction?
Negative Correlation – Therapeutic
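The scoring step in the pipeline can be illustrated with a rank correlation between a disease signature and a drug profile over their shared genes. This is a hedged sketch rather than the method used in the cited work: Spearman correlation is swapped in here as a simple stand-in for a disease-drug score, and the function names and toy values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def disease_drug_score(disease_sig, drug_profile):
    """Spearman correlation of the two profiles over their shared genes."""
    genes = sorted(set(disease_sig) & set(drug_profile))
    x = average_ranks([disease_sig[g] for g in genes])
    y = average_ranks([drug_profile[g] for g in genes])
    return pearson(x, y)


# Toy profiles with placeholder gene names.
disease_sig = {"G1": 2.0, "G2": 1.0, "G3": -1.0, "G4": -2.0}
drug_opposite = {"G1": -1.5, "G2": -0.5, "G3": 0.7, "G4": 1.9}
drug_similar = {"G1": 1.2, "G2": 0.4, "G3": -0.3, "G4": -1.1}

score_opposite = disease_drug_score(disease_sig, drug_opposite)
score_similar = disease_drug_score(disease_sig, drug_similar)
print(score_opposite, score_similar)
```

Under the reading on this slide, a strongly negative score flags a candidate therapeutic and a strongly positive score flags a possible adverse reaction.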
Similar Diseases Cluster
Based on Disease-Drug Similarity
Families of Drugs Cluster
Based on Disease-Drug Similarity
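Grouping drugs (or diseases) by their score vectors can be sketched with a simple distance threshold. The example below is an assumption-laden illustration, not the clustering used for these slides: it links any two drugs whose score vectors lie within a chosen Euclidean distance, which is equivalent to cutting a single-linkage tree at that height. All names and values are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length score vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def threshold_clusters(score_rows, max_distance):
    """Group rows whose vectors are chained together by distances <= max_distance.

    `score_rows` maps a drug name to its scores across diseases.
    Uses a small union-find structure to collect connected components.
    """
    names = sorted(score_rows)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if euclidean(score_rows[a], score_rows[b]) <= max_distance:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())


# Toy disease-drug scores: one row per drug, one column per disease.
drug_scores = {
    "drug_a": [-0.8, 0.1, 0.5],
    "drug_b": [-0.7, 0.2, 0.4],
    "drug_c": [0.6, -0.5, -0.9],
}

clusters = threshold_clusters(drug_scores, max_distance=0.5)
print(clusters)
```

Transposing the score table (one row per disease, one column per drug) would group diseases in the same way.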
Crohn’s Disease
• An inflammatory disease of the intestines
that has an autoimmune component
• Affects 500,000 people in North America
• No known pharmaceutical cure
• Current solutions:
– Reduce inflammation with anti-inflammatory drugs and
corticosteroids (prednisone)
– Bad side effects
– Surgical solutions
Therapeutic Predictions for
Crohn’s Disease
Topiramate – An Anti-Seizure
Drug
• Suppresses the rapid and excessive
firing of neurons that start a seizure
• Enhances GABA-activation
• Used to treat epilepsy, bipolar disorder
• Antidepressant
• Investigated as potential
treatment for obesity and
type II diabetes
Topiramate and Crohn’s
Genes that are
up-regulated by the drug are
down-regulated in the disease
Genes that are
down-regulated by the drug are
up-regulated in the disease
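The opposing pattern described here can be checked with plain set operations on the up- and down-regulated gene lists. This is a minimal sketch under stated assumptions: the function name is hypothetical and the gene labels are placeholders, not genes from the topiramate or Crohn's data.

```python
def reversal_summary(disease_up, disease_down, drug_up, drug_down):
    """Split shared genes into those the drug reverses and those it mimics."""
    reversed_genes = (drug_up & disease_down) | (drug_down & disease_up)
    mimicked_genes = (drug_up & disease_up) | (drug_down & disease_down)
    return {
        "reversed": sorted(reversed_genes),
        "mimicked": sorted(mimicked_genes),
    }


# Placeholder gene sets for illustration only.
disease_up = {"G1", "G2", "G3"}
disease_down = {"G4", "G5"}
drug_up = {"G4", "G5", "G3"}
drug_down = {"G1", "G2"}

summary = reversal_summary(disease_up, disease_down, drug_up, drug_down)
print(summary)
```

A drug whose reversed list is much longer than its mimicked list shows the anti-correlated pattern that motivates a therapeutic prediction.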
Animal Model for Crohn’s
• TNBS (trinitrobenzene sulfonic acid) +
ethanol induced rats:
– Excellent and reproducible experimental model
for Inflammatory Bowel Disease (Crohn’s and
Ulcerative Colitis)
– Toxin-based model
Normal TNBS Induced
Pilot Validation Study Design
• Pilot Study – 18 rats
– Healthy (control)
– TNBS-Induced Untreated
– TNBS-Induced Treated
• 80 mg/kg topiramate, injected daily
• Colon tissue macroscopic damage score
Reetesh Pai, Mohan Shenoy and Pankaj Jay
Validation Results
Two Follow-up Validation
Studies
• 48 rats each – 4 groups of 12 rats
– Healthy Controls
– TNBS + Vehicle
– TNBS + Prednisolone
– TNBS + Topiramate
• 7 days
• Clinical Signs, Pathology Score, Histology
• Endoscopy Images
Clinical Signs
Pathology Scores
[Figure: gross pathology score (axis from 0 to 5) for the Vehicle, TNBS+Veh, TNBS+Pred, and TNBS+Top groups; **** marks a significant difference]
Histology
Endoscopy
Drug-Disease Signature
Ongoing work
• Extending the drug datasets to use structural
data
• Incorporating meta-analysis methods
• Application to cancer (lung cancer, liver
cancer, medulloblastoma)
• More focused cell line selection
• Looking at dosage response and combination
therapy prediction
• Leveraging EMR and clinical trial data
Chen B, Sirota M, Fan-Minogue H, Hadley D, Butte AJ. Relating Hepatocellular Carcinoma Tumor Samples and Cell
Lines Using Gene Expression Data in Translational Research. BMC Medical Genomics, 2015.
Wu M, Sirota M, Butte AJ, Chen B. Characteristics of drug combination therapy in oncology by analyzing clinical trial
data on clinicaltrials.gov. Pac Symp Biocomput. 2015.
Can we use data integration
to…
Biomarker
Discovery
… to find
better
diagnostic
markers?
Disease
Mechanism
… understand
disease
better?
Therapeutics
… find new
uses for
existing
drugs?
Precision Medicine
responders
non-responders
test
Acknowledgements
Atul Butte
Joel Dudley
Annie P. Chiang
Alex Morgan
Pankaj Jay Pasricha
Mohan Shenoy
Minnie Sarwal
Reetesh Pai
Julien Sage
Silke Roedder
Alejandro Sweet-Cordero
Bin Chen
Hanna Paik
Dexter Hadley
Institute for Computational Health
Sciences @ UCSF
Thanks!
marina.sirota@ucsf.edu
Data Science in Drug Discovery

  • 1. Data Science in Drug Discovery Marina Sirota, PhD Assistant Professor Institute for Computational Health Sciences October 22, 2015
  • 4. Moore’s Law – Biology and Computation Cost Per Genome Cost of Computational Resources
  • 5. Can we use data integration to… Biomarker Discovery … to find better diagnostic markers? Disease Mechanism … understand disease better? Therapeutics … find new uses for existing drugs?
  • 6. Motivation • Problem: – Takes roughly 15 years and over $800 million to develop and bring a novel drug to market – 90% of drugs fail in early development • Solution: Drug Repurposing – Lower cost – Reduce risk of failure
  • 7. Problem Statement Can we use public data to systematically predict relationships between drugs and diseases? Sirota M, Dudley JT, Kim J, Chiang AP, Morgan AA, Sweet-Cordero A, Sage J, Butte AJ. Discovery and Validation of Drug Indications Using Compendia of Public Gene Expression Data. Science Translational Medicine. Aug 2011.
  • 8. Problem Statement Can we use public data to systematically predict relationships between drugs and diseases? Diseases Drugs Sirota M, Dudley JT, Kim J, Chiang AP, Morgan AA, Sweet-Cordero A, Sage J, Butte AJ. Discovery and Validation of Drug Indications Using Compendia of Public Gene Expression Data. Science Translational Medicine. Aug 2011.
  • 9. What is Gene Expression Profiling? • Global snapshot of cellular function and activity – Genome sequence – what might be going on – Expression – what is actually going on • 25,000 genes → 1,000,000 proteins • We can measure a few thousand proteins, but gene expression is a global proxy How Can We Measure Expression?
  • 10. Microarrays • Thousands of probes are hybridized to a solid surface • Takes advantage of complementary DNA sequences • Process: – RNA is extracted from the sample – Fluorescent labeling – Hybridization and wash – Scanning and signal processing – Normalization and analysis!
  • 11. Data Sources • Collection of expression data from cultured human cells • 453 experiments of 164 drugs • Covers broad range of effects – FDA approved drugs – Non drug bioactive small molecules • Publicly available gene expression repository – Platforms – 11,745 – Samples – 961,202 – Series – 39,679 • There are numerous experiments dealing with over 200 diseases. Barrett et al. NCBI GEO: archive for high-throughput functional genomic data. Nucleic Acids Res. 2009. Lamb et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science. 2006.
  • 12. Disease Gene Expression Data (GEO) Butte AJ, Chen R. AMIA, 2006. Download all GDS Experiments GEO Identify Disease Associated Experiments Identify Normal vs. Disease Experiments 176 datasets, 3113 arrays, 100 diseases Dudley J, Butte AJ. PSB, 2008. Dudley JT, Tibshirani R, Deshpande T, Butte AJ. Disease signatures are robust across tissues and experiments. Mol
  • 14. Drug Gene Expression Profile Treated Sample Untreated Sample Drug Gene Expression Profile
  • 15. Hypothesis: Gene Expression Profiles [Figure: up-regulated and down-regulated genes compared for Disease vs. Drug A, Disease vs. Drug B, and Disease vs. Drug C; outcomes labeled Treatment, Adverse Reaction, and ?] Lamb et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease.
  • 16. Computational Pipeline Disease Gene Expression Signature Genes Drugs Disease-Drug Scores Drugs Similar to Disease Drugs Opposite to Disease
  • 17. Drug-Disease Relationships Drugs Diseases Positive Correlation – Adverse Reaction? Negative Correlation - Therapeutic
  • 18. Similar Diseases Cluster Based on Disease-Drug Similarity
  • 19. Families of Drugs Cluster Based on Disease-Drug Similarity
  • 20. Families of Drugs Cluster Based on Disease-Drug Similarity
  • 21. Drug-Disease Relationships Drugs Diseases Positive Correlation – Adverse Reaction? Negative Correlation - Therapeutic
  • 22. Crohn’s Disease • An inflammatory disease of the intestines that has an autoimmune component • Affects 500,000 people in North America • No known pharmaceutical cure • Current solutions: – Reduce inflammation with anti-inflammatory drugs and corticosteroids (prednisone) – Bad side effects – Surgical solutions
  • 25. Topiramate – An Anti-Seizure Drug • Suppresses the rapid and excessive firing of neurons that start a seizure • Enhances GABA-activation • Used to treat epilepsy, bipolar disorder • Antidepressant • Investigated as potential treatment for obesity and type II diabetes
  • 26. Topiramate and Crohn’s • Genes that are up-regulated by the drug are down-regulated in the disease • Genes that are down-regulated by the drug are up-regulated in the disease
  • 27. Animal Model for Crohn’s • TNBS (trinitrobenzene sulfonic acid) + ethanol induced rats: – Excellent and reproducible experimental model for Inflammatory Bowel Disease (Crohn’s and Ulcerative Colitis) – Toxin-based model Normal TNBS Induced
  • 28. Pilot Validation Study Design • Pilot Study – 18 rats – Healthy (control) – TNBS-Induced Untreated – TNBS-Induced Treated • 80 mg/kg topiramate, injected daily • Colon tissue macroscopic damage score Reetesh Pai, Mohan Shenoy and Pankaj Jay
  • 30. Two Follow-up Validation Studies • 48 rats each – 4 groups of 12 rats – Healthy Controls – TNBS + Vehicle – TNBS + Prednisolone – TNBS + Topiramate • 7 days • Clinical Signs, Pathology Score, Histology • Endoscopy Images
  • 36. Ongoing work • Extending the drug datasets to use structural data • Incorporating meta-analysis methods • Application to cancer (lung cancer, liver cancer, medulloblastoma) • More focused cell line selection • Looking at dosage response and combination therapy prediction • Leveraging EMR and clinical trial data. Chen B, Sirota M, Fan-Minogue H, Hadley D, Butte AJ. Relating Hepatocellular Carcinoma Tumor Samples and Cell Lines Using Gene Expression Data in Translational Research. BMC Medical Genomics, 2015. Wu M, Sirota M, Butte AJ, Chen B. Characteristics of drug combination therapy in oncology by analyzing clinical trial data on clinicaltrials.gov. Pac Symp Biocomput. 2015.
  • 37. Can we use data integration to… Biomarker Discovery … to find better diagnostic markers? Disease Mechanism … understand disease better? Therapeutics … find new uses for existing drugs?
  • 39. Acknowledgements: Atul Butte, Joel Dudley, Annie P. Chiang, Alex Morgan, Pankaj Jay Pasricha, Mohan Shenoy, Minnie Sarwal, Reetesh Pai, Julien Sage, Silke Roedder, Alejandro Sweet-Cordero, Bin Chen, Hanna Paik, Dexter Hadley
  • 40. Institute for Computational Health Sciences @ UCSF
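
As a rough illustration of slide 14, a drug gene expression profile can be sketched by comparing treated and untreated measurements gene by gene and then ranking the genes by the change. The log2 ratio, the `drug_profile` helper, and the gene names and intensity values below are all assumptions made for illustration; the slides do not state how the profile is computed.

```python
import math

def drug_profile(treated, untreated):
    """Rank genes from most up-regulated to most down-regulated.

    Assumption: differential expression is summarized as a log2 ratio of
    treated over untreated intensity. The slides do not specify the measure.
    """
    log_ratio = {
        gene: math.log2(treated[gene] / untreated[gene])
        for gene in treated
        if gene in untreated  # only genes measured in both samples
    }
    # Highest log ratio first: the top of the list is up-regulated by the drug.
    return sorted(log_ratio, key=log_ratio.get, reverse=True)

# Hypothetical intensities for four made-up genes.
treated = {"GENE_A": 800.0, "GENE_B": 100.0, "GENE_C": 50.0, "GENE_D": 400.0}
untreated = {"GENE_A": 200.0, "GENE_B": 100.0, "GENE_C": 400.0, "GENE_D": 200.0}

ranked = drug_profile(treated, untreated)
print(ranked)
```

The ranked list, rather than the raw ratios, is what the later comparison against a disease signature operates on.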

Editor's Notes

  1. Good morning, my name is Marina Sirota. I’m currently a lead research scientist in the Division of Systems Medicine at Stanford University. Previously I worked at Pfizer under David Cox in a genetics group, applying next-generation sequencing technologies to discover novel drug targets and develop population stratification techniques for clinical trials. Today I will tell you a bit about translational bioinformatics, systems medicine, and how it might impact transplantation practice in the near future.
  2. Since then we have come a looong way. Thousands of people have been sequenced and millions of individuals have been genotyped. These are all resources that have been created and most importantly they are open to the public.
  3. People are also starting to use sequencing in creative ways: immunome, metagenome, epigenetics, cell-free DNA.
  4. Moore's Law is the observation, made in 1965 by Gordon Moore, co-founder of Intel, that the number of transistors per square inch on integrated circuits had doubled every year since the integrated circuit was invented. Moore predicted that this trend would continue for the foreseeable future.
  5. Drug repurposing is finding a novel indication for a known FDA-approved drug. Computational work can be instrumental in making this feasible. Example: Viagra was initially studied for use in hypertension.
  6. So far I have focused on the genetics piece, or the DNA. I would like to talk a little bit about RNA, the middle piece of the central dogma, and ways we can measure it using gene expression profiling. Expression profiles can, for example, distinguish between cells that are actively dividing, or show how the cells react to a particular treatment. Used to generate hypotheses, mechanism of action. Post-translational modification, alternative splicing.
  7. Microarrays are one technology to measure gene expression. They were first developed in 1995, nearly 20 years ago.
  8. We have GEO data, but the format doesn’t make it easy to ask relevant questions. We have built an infrastructure to enable this sort of analysis.
  9. This approach is especially likely to yield good results for diseases with a strong gene dysregulation component, such as autoimmune disease or cancer.
  10. If the up-regulated disease genes appear near the top (up-regulated) of the rank-ordered drug gene expression list and the down-regulated disease genes fall near the bottom (down-regulated), we can conclude that the drug and disease expression profiles are similar. If the up-regulated disease genes fall near the bottom of the rank-ordered drug gene expression list and the down-regulated disease genes are near the top, the profiles are opposite, which suggests a therapeutic relationship. Randomization: pick a signature at random and recompute the drug-disease scores 100 times. FDR.
  11. 100 diseases, 164 drugs, about 16,000 drug-disease pairs (100 × 164 = 16,400). 53 diseases with significant predictions. Not everything is treatable.
  12. Hierarchical clustering: brain cancers, other cancers, lung injury, UC and Crohn’s.
  13. Histone deacetylase (HDAC) inhibitors (in red). Drugs known to affect different parts of the same pathway also cluster together: phosphatidylinositol-3-kinase (PI3K) inhibitors LY-294002 and wortmannin (in green).
  14. heat shock protein 90 (HSP90) inhibitors (in orange)
  15. Chose Crohn’s but have others
  16. Known drug. Two that are better. One is FDA approved, so go for this one.
  17. Looked for an animal model of Crohn’s and found one
  18. Define macroscopic damage score. Scale 0-6, what they mean. Define axes.
  19. Earlier this year President Obama launched a $215 million investment in the Precision Medicine Initiative, which will pioneer a new model of patient-powered research that promises to accelerate biomedical discoveries and provide clinicians with new tools, knowledge, and therapies to select which treatments will work best for which patients.
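
The comparison described in note 10 can be sketched in a few lines. This is a deliberately simplified rank-based score, not the statistic used in the published study or by the Connectivity Map. The function names, the scaling of the score to the range -1 to 1, and the toy gene list are hypothetical. The randomization step follows the note by recomputing the score for 100 random signatures; the FDR correction across all drug-disease pairs is not shown.

```python
import random

def rank_positions(ranked_genes):
    """Map each gene to a position in [-1, 1]: +1 is the top of the drug list
    (most up-regulated by the drug), -1 is the bottom (most down-regulated)."""
    n = len(ranked_genes)  # assumes at least two genes
    return {g: 1.0 - 2.0 * i / (n - 1) for i, g in enumerate(ranked_genes)}

def similarity_score(ranked_genes, up, down):
    """Simplified drug-disease score (an assumption, not the published method).

    Positive: disease-up genes sit near the top of the drug list (similar).
    Negative: disease-up genes sit near the bottom (opposite profile).
    All signature genes must be present in ranked_genes.
    """
    pos = rank_positions(ranked_genes)
    up_mean = sum(pos[g] for g in up) / len(up)
    down_mean = sum(pos[g] for g in down) / len(down)
    return (up_mean - down_mean) / 2.0

def permutation_p_value(ranked_genes, up, down, n_perm=100, seed=0):
    """Empirical two-sided p-value from random signatures of the same size."""
    rng = random.Random(seed)
    observed = similarity_score(ranked_genes, up, down)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(ranked_genes, len(up) + len(down))
        null = similarity_score(ranked_genes, sample[:len(up)], sample[len(up):])
        if abs(null) >= abs(observed):
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Toy drug list of 20 made-up genes, ranked from most up- to most down-regulated.
ranked_genes = [f"G{i}" for i in range(20)]
# Hypothetical disease signature: its up-regulated genes are the ones the drug
# pushes down, and vice versa, so the score comes out negative.
disease_up = ["G17", "G18", "G19"]
disease_down = ["G0", "G1", "G2"]

score, p_value = permutation_p_value(ranked_genes, disease_up, disease_down)
print(f"score={score:.3f}, empirical p={p_value:.3f}")
```

A negative score corresponds to the "opposite" case in note 10, which the talk treats as a candidate therapeutic relationship; a positive score corresponds to the "similar" case.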