Health data science
Why study data science?
What is health data science?
• Data-driven solutions to complex real-world health problems
• Or deriving knowledge from unstructured and messy data
• It is an interdisciplinary field: biostatistics, computer science,
epidemiology, public health, mathematics, etc.
But basically…
Real-life health data science examples
• HIV:
• Visualising the pattern of early HIV transmission within the mucosal barrier
• COVID-19:
• What can predict COVID-19 neutralisation activity?
• Can we predict COVID-19 vaccine efficacy?
Early HIV transmission
dynamics
Background
• Early HIV transmission events can occur during vaginal or anal sex
• We want to investigate whether the mucosal barrier (within the vaginal tissue) is
effective at blocking HIV transmission
If the mucosal barrier is good at preventing viral
transmission, this is what we expect to see
If the mucosal barrier is not good at preventing
transmission, multiple viruses can be found
(random infection)
If the mucosal barrier is not good at preventing
transmission, multiple viruses can be found
(clustered infection)
Animal experiment
Data
Data Visualisation
Many viral variants are still visible:
no evidence that the vaginal tissue
is effective at blocking viral entry
Need a formal method
• How can we say (formally) whether infection is spatially clustered or not?
• Mantel test (or Mantel and Valand) -> relates a matrix of
“geographical” distances to a matrix of “biological” distances
• So we first need to define the “geographical” matrix and the “biological” matrix
“Geographical” distance
• Euclidean distance
$$d_{i,j} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$$
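As a minimal sketch of how the “geographical” matrix could be built from the distance formula above: the function name and the sampling coordinates below are hypothetical and are not data from the study.

```python
import math

def euclidean_distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(coords)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            (xi, yi), (xj, yj) = coords[i], coords[j]
            # hypot computes sqrt((xi - xj)^2 + (yi - yj)^2)
            d = math.hypot(xi - xj, yi - yj)
            dist[i][j] = dist[j][i] = d
    return dist

# Hypothetical (x, y) positions of three sampled tissue sites, arbitrary units.
sites = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
geo = euclidean_distance_matrix(sites)
```

The diagonal stays at zero and the matrix is symmetric, which is the shape a Mantel-type test expects.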
“Biological” distance
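• Morisita-Horn index of overlap
$$MH = \frac{2 \sum_i \dfrac{n_{1i}\, n_{2i}}{N_1 N_2}}{\sum_i \left( \dfrac{n_{1i}^2}{N_1^2} + \dfrac{n_{2i}^2}{N_2^2} \right)}$$
where $n_{1i}$ and $n_{2i}$ are the counts of category $i$ (here, viral variant $i$) in samples 1 and 2, and $N_1$, $N_2$ are the total counts in each sample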
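A small sketch of the Morisita-Horn overlap in its standard form may help here. The function name and the variant counts are made up for illustration. Using 1 − MH as the “biological” distance is an assumption, since the slides only report similarities.

```python
def morisita_horn(counts1, counts2):
    """Morisita-Horn overlap between two samples of per-variant counts.

    Returns 1.0 for identical composition and 0.0 for no shared variants.
    """
    total1, total2 = sum(counts1), sum(counts2)
    if total1 == 0 or total2 == 0:
        raise ValueError("each sample needs at least one observation")
    cross = sum(a * b for a, b in zip(counts1, counts2)) / (total1 * total2)
    within1 = sum(a * a for a in counts1) / total1 ** 2
    within2 = sum(b * b for b in counts2) / total2 ** 2
    return 2 * cross / (within1 + within2)

# Hypothetical counts of three viral variants at two tissue sites.
site_a = [10, 0, 5]
site_b = [8, 1, 6]
similarity = morisita_horn(site_a, site_b)
distance = 1.0 - similarity  # assumed conversion from similarity to distance
```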
“Biological” distance
• Similarity between 1 and 2 = 0.98
• Similarity between 1 and 3 = 0.46
Mantel Test (or Mantel and Valand)
• Testing the association between two matrices
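• The Mantel quantity ($Z_m$) is given by:
$$Z_m = \sum_i \sum_j g_{ij}\, b_{ij}$$
where $g_{ij}$ and $b_{ij}$ are the entries of the “geographical” and “biological” distance matrices
• Basic idea -> permutation test
• Randomly permute the rows and columns (together) of one of the two matrices
• Store the value of $Z_m$ for each permutation
• The p-value is the proportion of permutations whose $Z_m$ is at least as extreme as the observed one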
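The permutation idea can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the code behind the slides: the function names, the one-sided p-value, the 999 permutations, and the toy matrices are all choices made for this example.

```python
import random

def mantel_z(g, b):
    """Mantel quantity: sum over all i, j of g[i][j] * b[i][j]."""
    n = len(g)
    return sum(g[i][j] * b[i][j] for i in range(n) for j in range(n))

def mantel_test(g, b, n_perm=999, seed=0):
    """One-sided permutation test for a positive association between g and b."""
    rng = random.Random(seed)
    n = len(g)
    observed = mantel_z(g, b)
    order = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        # Apply the same permutation to the rows and columns of b.
        permuted = [[b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if mantel_z(g, permuted) >= observed:
            hits += 1
    # Count the observed arrangement itself so the p-value is never zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value

# Toy case: six sites on a line, biological distance mirrors geography exactly.
positions = [0, 1, 2, 10, 11, 12]
geo = [[abs(p - q) for q in positions] for p in positions]
bio = [row[:] for row in geo]
z_obs, p_value = mantel_test(geo, bio)
```

Permuting rows and columns with the same index order keeps the permuted matrix a valid distance matrix, which is why the two are shuffled together rather than independently.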
Low p-values: infection is clustered locally
within the vaginal tissue
What can predict COVID-19
viral neutralisation activity?
Background
• Neutralising antibody (NAb): an antibody that defends the host by blocking
a specific pathogen from infecting cells
• Data: 41 convalescent adults; 13 immunological
parameters measured
• Goal: find out which immunological parameters can be used
to predict NAb in those 41 recovered patients
Methods
• Data visualisation is very important in data science
• First step: plot the correlation matrix for the whole dataset
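As a rough sketch of this first step, the pairwise Pearson correlations can be computed before plotting them as a heat map. The column names and values below are invented placeholders, not the study's measurements.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Nested dict of correlations for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical measurements for five patients (placeholder values only).
columns = {
    "microneut": [1.0, 2.0, 3.0, 4.0, 5.0],
    "rbd": [2.0, 4.0, 5.0, 8.0, 10.0],
    "ccr6_cxcr3neg": [9.0, 7.0, 6.0, 4.0, 1.0],
}
corr = correlation_matrix(columns)
```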
Microneutralization is positively correlated
with SARS-CoV-2 RBD
Microneutralization is negatively correlated
with CCR6+CXCR3-
OK, not very informative...
Too many things are correlated with microneutralization
Methods
• The correlation matrix shows that NAb is correlated with many variables
• Next step: Can I find some hidden features in this dataset?
• Method: principal component analysis (PCA)
The main focus is microneutralization
If the angle between microneutralization and another variable is less
than 90°, it is a positive association
If the angle between microneutralization and another variable is greater
than 90°, it is a negative association
For instance, higher ELISA S trimer goes with a higher
microneutralization level (angle less than 90°)
For instance, higher CCR6+CXCR3- goes with a lower
microneutralization level (angle more than 90°)
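The angle rule can be checked numerically. The sketch below assumes a correlation biplot, where each variable's arrow is its PC1/PC2 loading scaled by the component's standard deviation. The data are simulated, and the function names are hypothetical.

```python
import numpy as np

def biplot_vectors(data):
    """Arrow coordinates (PC1, PC2) for each column of a samples-by-variables array."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    # Scale each loading by its component's standard deviation, so the dot
    # product of two arrows approximates the correlation of the two variables.
    return vt[:2].T * (s[:2] / np.sqrt(len(z) - 1))

def angle_deg(a, b):
    """Angle in degrees between two arrows."""
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Simulated data: columns 0 and 1 move together, column 2 moves the opposite way.
rng = np.random.default_rng(0)
signal = rng.normal(size=50)
data = np.column_stack([
    signal + 0.3 * rng.normal(size=50),
    signal + 0.3 * rng.normal(size=50),
    -signal + 0.3 * rng.normal(size=50),
])
arrows = biplot_vectors(data)
same_direction = angle_deg(arrows[0], arrows[1])      # expected below 90
opposite_direction = angle_deg(arrows[0], arrows[2])  # expected above 90
```

The scaling matters: with unscaled eigenvector loadings, the angle between two arrows does not reliably reflect the sign of their correlation.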
Methods
• PCA visualisation is more informative than the correlation matrix
• But we still cannot pick out a single variable that can be used to predict NAb
• Next step: I want to pick only one thing to predict NAb
• Method: multiple linear regression with a backward model selection
strategy
• The idea is to fit a linear regression with all the variables, then iteratively
remove the least significant predictor until all remaining predictors are significant
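Backward selection can be sketched as follows. To stay dependency-light, this version treats a predictor as significant when its |t| statistic is at least 2.0, a rough stand-in for a 5% p-value cut-off. That threshold, the function name, and the simulated data are all assumptions of this example.

```python
import numpy as np

def backward_select(X, y, names, t_crit=2.0):
    """Drop the weakest predictor until every remaining |t| is at least t_crit."""
    keep = list(range(X.shape[1]))
    while keep:
        # Design matrix: intercept column plus the predictors still in the model.
        A = np.column_stack([np.ones(len(y)), X[:, keep]])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        dof = len(y) - A.shape[1]
        sigma2 = resid @ resid / dof
        std_err = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
        t_abs = np.abs(beta / std_err)[1:]  # skip the intercept
        worst = int(np.argmin(t_abs))
        if t_abs[worst] >= t_crit:
            break  # every remaining predictor passes the cut-off
        keep.pop(worst)
    return [names[k] for k in keep]

# Simulated data: 41 subjects, 4 candidate predictors, only p0 and p2 matter.
rng = np.random.default_rng(1)
X = rng.normal(size=(41, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=41)
selected = backward_select(X, y, ["p0", "p1", "p2", "p3"])
```

In practice a statistics package would supply exact p-values, and stepwise selection on a small sample is known to be unstable, so the selected set is best treated as exploratory.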
Two main things are highly predictive of NAb
Predicting COVID-19
vaccine efficacy
Background
• At the end of a phase 2 trial, we get the immunogenicity data
(a measure of the amount of antibody)
• Given the antibody data from the phase 2 trial, can we predict
what the efficacy of the vaccine will be?
• Training dataset: efficacy and antibody data from all available vaccines
Methods
• The first step is always to visualise your data, so why don’t we plot
efficacy against antibody first?
High antibody = high efficacy
Low antibody = low efficacy
Can we simply build a classifier based on the
level of antibody?
Methods
• The model is a distribution-free binary classification model, based on
a threshold level of antibody
• The lower your antibody level, the higher your chance of being infected,
so the vaccine efficacy will be lower
• The higher your antibody level, the lower your chance of being infected,
so the vaccine efficacy will be higher
• We want to know what this antibody threshold is
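One way to picture a threshold model like this is sketched below. It assumes a person is protected when their normalised antibody level exceeds the threshold, so predicted efficacy is the fraction of vaccinees above it. The least-squares fitting rule, the function names, and all numbers are illustrative assumptions, not the talk's actual procedure or data.

```python
def predicted_efficacy(levels, threshold):
    """Fraction of individuals whose antibody level exceeds the threshold."""
    return sum(level > threshold for level in levels) / len(levels)

def fit_threshold(training):
    """Pick the candidate threshold minimising squared error in efficacy.

    training: list of (individual_levels, observed_efficacy) pairs, one per vaccine.
    Candidate thresholds are the observed antibody levels themselves.
    """
    candidates = sorted({level for levels, _ in training for level in levels})

    def squared_error(threshold):
        return sum((predicted_efficacy(levels, threshold) - efficacy) ** 2
                   for levels, efficacy in training)

    return min(candidates, key=squared_error)

# Made-up normalised antibody levels and efficacies for three hypothetical vaccines.
training = [
    ([0.1, 0.2, 0.3, 0.5, 0.8], 0.4),
    ([0.3, 0.6, 1.0, 2.0, 4.0], 0.8),
    ([0.5, 1.0, 2.0, 4.0, 8.0], 1.0),
]
threshold = fit_threshold(training)

# Apply the fitted threshold to a new hypothetical vaccine's phase 2 antibody data.
new_vaccine_levels = [0.2, 0.4, 0.9, 1.2]
new_efficacy = predicted_efficacy(new_vaccine_levels, threshold)
```

Because the prediction uses the empirical distribution of antibody levels directly, no parametric distribution has to be assumed, which is one reading of "distribution-free" here.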
We normalised the antibody level to that of convalescent patients
(the convalescent mean is one)
Covaxin data came out a bit later, so we used Covaxin to
validate our ‘classifier’ model
Using our classifier, as long as we have antibody data (from a
phase 2 trial), we can predict the efficacy of any vaccine
CureVac mRNA vaccine failure – why???
Simple data visualisation can help to answer
Because it used a lower dose than Pfizer and Moderna
