gsk.com
AI & Big Data Expo, London
Machine learning, biomedical data & trust
Paul Agapow (Statistics & Data Science Innovation Hub)
Background & disclaimer
• Previously a health informatician, biomedical ML researcher, bioinformatician, “computer guy”, disease chaser, epi-informatician, phylogeneticist, evolutionary biologist, immunologist, biochemist …
• Now a director @GSK
• This presentation does not reflect thought, policy or projects in progress at GSK
• There are no conflicts of interest
10 June 2021 3
“AI will not replace drug hunters, but drug hunters who don’t use AI will be replaced by those who do.”
- Andrew Hopkins, CEO Exscientia
07 February 2023
3 hurdles to using AI/ML in therapy development
• Biological & physiological complexity
• Insufficient & uneven data
• A gap between AI/ML practice & medical needs
To make a new drug, you must first solve for everything
12 July 2021 7
The complexity of biology:
About 50 trillion cells of 200 types
Each cell has 23 pairs of chromosomes
In total 6.4 billion basepairs (positions)
Organised into about 18,000 genes
(Or maybe more like 40,000 genes)
Genetic material elsewhere in the cell
Epigenetic modification
1 million different types of molecules
Lifestyle & history
Exposure & environment
Immune system repertoire & priming
…
Of which we know only a fraction
The data types and sources we need are myriad & varied
Hughes et al. (2010) “Principles of early drug discovery”
There are many different means to the same end
• There are many different modalities of intervention
• With different (data) considerations & different levels of ML experience
McKinsey, EvaluatePharma 2022
It’s often not the right data
• Difficult / expensive to generate
• Unstructured
• Unlabeled
• The wrong type
• Sparse, unevenly sampled
• WEIRD (drawn from Western, educated, industrialised, rich, democratic populations)
• In different formats and silos
A disconnect between AI/ML practice and medical needs
Academic focus on problems with low medical value
Melanie Mitchell via Dagmar Monett
A disconnect between AI/ML practice and medical needs
A tendency to treat biomedicine as simply a data / ML problem
• There are many models that work perfectly … in the lab
• Why?
- Unrealistic or poor training data
- Emphasis on hitting metrics
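One way a model can hit its metrics while being of little clinical use is class imbalance. The sketch below is a minimal illustration under stated assumptions: the labels are made up (5 critical cases in 100 patients, not data from any study mentioned in this talk), and `accuracy` and `recall` are small helper functions written for this example.

```python
# Hypothetical labels: 1 = patient became critical, 0 = did not.
# 5 critical cases among 100 patients (illustrative numbers only).
y_true = [1] * 5 + [0] * 95

# A trivial "model" that always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(truth, pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def recall(truth, pred):
    """Fraction of truly positive cases that the model flagged."""
    flagged = [p for t, p in zip(truth, pred) if t == 1]
    return sum(flagged) / len(flagged) if flagged else 0.0


print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"recall   = {recall(y_true, y_pred):.2f}")
```

The headline accuracy looks strong, yet the model never identifies a single critical patient, which is the case the clinic cares about.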
The classic analytical tension
What we tend to solve:
• Easy things
• Available, ideal data
• Ground truth
• Simplify
• “Interesting”
• “Table-land”
What we need to solve:
• Useful things
• Incomplete messy data
• Unclear biological reality
• Uncertain findings
• Needful
• “Network-land”
A disconnect between AI/ML practice and medical needs
Many “good” models are not fit for production
Laure Wynants via Maarten van Smeden
COVID was a lightning rod for bad biomedical ML
• The pandemic prompted a flood of publications & preprints
• Most were plagued by the usual biomedical AI problems
• … and many were also produced by those outside the field
• As a general principle, any paper applying ML to COVID is terrible
• Bad models in a crisis situation are not neutral: they distract, expend effort and carry an opportunity cost
“Interpretable Prediction of Severity & Crucial Factors of COVID Patients”
Zheng et al. BioMed Research International (2021), DOI: 10.1155/2021/8840835
• What does it purport to do? Find risk factors associated with deterioration of COVID patients
• Why? Better / faster assessment of incoming patients
• Who? Patients admitted to two hospitals with a positive PCR test for COVID and a CT scan showing lesions
• Data? Demographics, bloods, labs, breathing / oxygen scores, manually scored CT scans
Thoughts and questions
Not necessarily faults, not all easily answerable
• Conflates diagnosis & prognosis
• The cohort:
- It is suggested this can replace PCR, but the cohort is selected by PCR result
- The act of taking a CT scan in some ways selects the cohort
- Unclear when some readings were taken, which matters when we are looking at deterioration
- Is the training set the population that a model might be used on in the clinic?
- Not many critical cases, so it is actually testing for severe cases
- What’s the split between hospitals?
- Patients are different already: pre-existing conditions
- Association with age & general health
- Old patients running a temperature with lesioned lungs do poorly
• Clinical use:
- Will all this data be available in a timely fashion for a model in the clinic?
- If the severity is based on bloods & oxygenation readings, why not just use them?
- Information complexity?
• Validation:
- Would it work for another time period at the same hospitals? At other hospitals?
• Analytics:
- “The impenetrable wall of math”
- XGBoost is always a good place to start
- Ensemble methods usually are
- Feature interaction?
- Some features overlap (neutrophils, n. ratio, NLR)
- What features correlate?
- No attempt to simplify model
- Any model is interpretable with SHAP
• Still useful for intrinsic / research purposes
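The question of overlapping and correlated features can be checked before any model is fitted. The following is a minimal sketch, not the method used in the paper: the patient values are invented, the 0.8 threshold is an arbitrary assumption, and `pearson` and `correlated_pairs` are hypothetical helpers written for this example. It flags feature pairs whose absolute Pearson correlation exceeds the threshold, using a derived ratio (NLR) alongside the counts it is computed from.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlated_pairs(features, threshold=0.8):
    """Return (name_a, name_b, r) for every feature pair with |r| >= threshold."""
    flagged = []
    for (a, xs), (b, ys) in combinations(features.items(), 2):
        r = pearson(xs, ys)
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged


# Toy values for five hypothetical patients (not taken from the paper).
neutrophils = [2.0, 4.0, 6.0, 8.0, 10.0]
lymphocytes = [2.0, 1.8, 2.2, 2.0, 1.9]
features = {
    "age": [70, 45, 62, 38, 55],
    "neutrophils": neutrophils,
    "lymphocytes": lymphocytes,
    # NLR is derived from the two counts above, so overlap is expected.
    "nlr": [n / l for n, l in zip(neutrophils, lymphocytes)],
}

for a, b, r in correlated_pairs(features):
    print(f"{a} ~ {b}: r = {r:.2f}")
```

Pairs flagged this way are candidates for dropping or combining, which also speaks to the point about simplifying the model.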
Some principles for better biomedical ML
• Models will always tell you the truth
- But it’s the truth conditioned on the data they’ve seen
- It might not be the truth you think
• Biomedical data is complex, it always comes with a context
• Patients are complex, they always come with a medical history
• How were these patients selected?
• What is this model actually saying and why?
• Does this model replicate in other populations?
• But despite all this, we have to make and actionably interpret models
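One simple way to ask whether a model replicates in other populations is to hold out each site in turn, rather than splitting patients at random. The sketch below assumes records are plain dictionaries with a `hospital` field; the function name `leave_one_site_out`, the field names and the patient rows are all hypothetical and only illustrate the splitting logic.

```python
from collections import defaultdict


def leave_one_site_out(records, site_key="hospital"):
    """Yield (held_out_site, train, test), holding out each site in turn."""
    by_site = defaultdict(list)
    for record in records:
        by_site[record[site_key]].append(record)
    for held_out in sorted(by_site):
        test = by_site[held_out]
        train = [
            row
            for site, rows in by_site.items()
            if site != held_out
            for row in rows
        ]
        yield held_out, train, test


# Hypothetical patient records; field names and values are illustrative only.
patients = [
    {"id": 1, "hospital": "A", "severe": 0},
    {"id": 2, "hospital": "A", "severe": 1},
    {"id": 3, "hospital": "B", "severe": 0},
    {"id": 4, "hospital": "B", "severe": 0},
    {"id": 5, "hospital": "B", "severe": 1},
]

for site, train, test in leave_one_site_out(patients):
    # A real pipeline would fit on `train` and score on `test` here.
    print(f"held out {site}: train n={len(train)}, test n={len(test)}")
```

If performance drops sharply on the held-out site, the model has probably learned something about where the patients were seen rather than about the patients themselves. The same pattern can be applied to time periods by using an admission-period field as the grouping key.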
Why not join us?
Academic Press (2021)
Some light reading
Academic Press (2021)
