LUNG CANCER RISK PREDICTION MODELS
Thao Ngo
INTRODUCTION
• Lung cancer is the leading cause of cancer death in the US, with an estimated
234,030 new cases and 154,050 deaths in 2018.
• Early detection using low-dose computed tomography (CT) screening of high-risk
individuals can reduce lung cancer mortality by 20%.
• The current CT screening criteria are adults aged 55-77 who currently smoke and have
a smoking history of at least 30 pack-years, but these simple criteria are relatively ineffective.
• Many studies suggest that lung cancer risk prediction models could lead
to more effective screening programs than the current screening criteria.
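To make the rule-based criteria above concrete, here is a minimal sketch in Python (the project itself used R). It encodes the criteria exactly as stated on this slide. The function and parameter names are hypothetical, and pack-years are computed with the usual definition of packs smoked per day multiplied by years smoked.

```python
def pack_years(packs_per_day, years_smoked):
    """Usual definition: packs smoked per day multiplied by years smoked."""
    return packs_per_day * years_smoked


def meets_screening_criteria(age, currently_smoking, packs_per_day, years_smoked):
    """Apply the slide's criteria: age 55-77, current smoker, at least 30 pack-years.

    Names and signature are illustrative assumptions, not part of the project.
    """
    return (
        55 <= age <= 77
        and currently_smoking
        and pack_years(packs_per_day, years_smoked) >= 30
    )


# A 60-year-old current smoker with 1.5 packs/day for 20 years (30 pack-years) qualifies.
eligible = meets_screening_criteria(60, True, 1.5, 20)
# A 54-year-old with a much heavier history is excluded purely by the age cut-off.
excluded_by_age = meets_screening_criteria(54, True, 2, 30)
```

The second call shows why such hard thresholds can be blunt: a single cut-off excludes a heavy smoker who is one year under the age limit.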
PROJECT PURPOSE
• Develop two risk prediction models for lung cancer using classification
algorithms in R:
  Decision Tree – Classification and Regression Tree (CART)
  Neural Network – Artificial Neural Network (ANN)
• Select the better model based on performance metrics.
• Identify the major risk factors associated with lung cancer.
DATA DESCRIPTION
Variable                   Type          Values
Patient ID                 Character
Age                        Numeric       14-73
Gender                     Binary        1-2
Smoking                    Numeric       1-8
Passive Smoking            Numeric       1-8
Air Pollution              Numeric       1-8
Occupational Hazards       Numeric       1-8
Genetic Risk               Numeric       1-7
Alcohol Use                Numeric       1-7
Chronic Lung Disease       Numeric       1-7
Dust Allergy               Numeric       1-7
Diet Balance               Numeric       1-7
Chest Pain                 Numeric       1-9
Short Breath               Numeric       1-9
Fatigue                    Numeric       1-9
Bloody Coughing            Numeric       1-9
Wheezing                   Numeric       1-7
Swallowing Difficulty      Numeric       1-7
Clubbing of finger nails   Numeric       1-7
Weight Loss                Numeric       1-7
Frequent Cold              Numeric       1-7
Dry Cough                  Numeric       1-7
Clubbing of finger nails   Numeric       1-9
Levels                     Chr/Binary    High, Medium, Low
• Data is a subset of the National Lung
Screening Trial Cohort
• 1000 randomized participants
• 22 attributes are potential risk
factors and symptoms of lung
cancer
• Each observation has one of 3
possible classes: Low, Medium, High
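The value ranges in the table above lend themselves to a simple sanity check before modeling. The following is a minimal Python sketch (the project itself used R) that covers only a few of the variables. The column names are hypothetical, since the slide does not give the dataset's actual field names.

```python
# Ranges copied from the table for a few variables; keys are assumed column names.
EXPECTED_RANGES = {
    "age": (14, 73),
    "gender": (1, 2),
    "smoking": (1, 8),
    "passive_smoking": (1, 8),
    "air_pollution": (1, 8),
    "alcohol_use": (1, 7),
    "chest_pain": (1, 9),
}
VALID_LEVELS = {"Low", "Medium", "High"}


def find_problems(record):
    """Return a list of human-readable problems found in one observation."""
    problems = []
    for name, (low, high) in EXPECTED_RANGES.items():
        value = record.get(name)
        if value is None:
            problems.append(f"{name}: missing")
        elif not low <= value <= high:
            problems.append(f"{name}: {value} outside {low}-{high}")
    if record.get("level") not in VALID_LEVELS:
        problems.append(f"level: unexpected class {record.get('level')!r}")
    return problems


# Made-up records, for illustration only.
good = {"age": 44, "gender": 1, "smoking": 3, "passive_smoking": 2,
        "air_pollution": 4, "alcohol_use": 5, "chest_pain": 2, "level": "Low"}
bad = dict(good, age=80, level="Unknown")

good_problems = find_problems(good)
bad_problems = find_problems(bad)
```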
DATA PREPARATION
MODELING
PERFORMANCE METRICS
Accuracy
• Accuracy = (true positives + true negatives) / (all positives + all negatives)
Sensitivity (True Positive Rate)
• Sensitivity = true positives / (true positives + false negatives)
Specificity (True Negative Rate)
• Specificity = true negatives / (true negatives + false positives)
Precision (Positive Predictive Value)
• Precision = true positives / (true positives + false positives)
Receiver Operating Characteristic (ROC) Area
• Measures a model's ability to discriminate between the positive and
negative classes
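With three classes (Low, Medium, High), these formulas are typically applied one class at a time, treating that class as positive and the other two as negative. The Python sketch below illustrates this one-vs-rest calculation under that assumption. The project itself used R, the function name is hypothetical, and the labels are made up for illustration rather than taken from the project data.

```python
from collections import Counter


def one_vs_rest_metrics(actual, predicted, positive):
    """Compute the four count-based metrics, treating `positive` as the positive class."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive:
            counts["tp" if p == positive else "fn"] += 1
        else:
            counts["fp" if p == positive else "tn"] += 1
    tp, tn, fp, fn = counts["tp"], counts["tn"], counts["fp"], counts["fn"]

    def ratio(numerator, denominator):
        # Avoid division by zero when a class never occurs or is never predicted.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


# Made-up labels, for illustration only.
actual = ["High", "High", "Low", "Medium", "Low", "Medium", "High", "Low"]
predicted = ["High", "Medium", "Low", "Medium", "Low", "Medium", "High", "High"]

metrics_high = one_vs_rest_metrics(actual, predicted, "High")
```

For the High class in this toy example there are 2 true positives, 1 false negative, 4 true negatives, and 1 false positive, so accuracy is 6/8 and sensitivity is 2/3.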
RESULT ANALYSIS
Decision Tree (CART)
Class     Accuracy  Sensitivity  Specificity  Precision  ROC area
High      .9832     .9541        1            1          .9721
Low       .9731     1            .9615        .9184      .9342
Medium    .9899     .9697        1            1          .9573
RESULT ANALYSIS
Neural Network (ANN)
Class           Accuracy  Sensitivity  Specificity  Precision  ROC area
High (black)    .9899     1            .9841        .9732      .9636
Low (red)       .9592     1            .8990        .8108      .8894
Medium (green)  .9194     .7576        1            1          .9039
MODEL EVALUATION
Model                        Accuracy  Sensitivity  Specificity  Precision  ROC area
Decision Tree (High level)   .9832     .9541        1            1          .9721
Neural Network (High level)  .9899     1            .9841        .9732      .9636
DISCUSSION
• In a medical test, a false negative is more dangerous than a false positive, so the final risk prediction model is
the Artificial Neural Network, which has 100% sensitivity (0% false negatives) for the High class, compared with
95.41% sensitivity (4.59% false negatives) for the Decision Tree.
• Based on the variable importance results, the most significant risk factors for lung cancer are Air Pollution,
Age, Smoking, Passive Smoking, and Alcohol Use.
• Future improvements
  Improve model performance by fine-tuning the model parameters.
  Reduce the number of input features to prevent overfitting.
  Increase the amount of input data for better model performance.
  Try other classification algorithms for better model selection (Support Vector Machine, Random Forest).
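The selection argument above (prefer the model with fewer false negatives) can be sketched in a few lines of Python, using the High-class figures reported in the model evaluation table. The project itself used R. Breaking ties on accuracy is an assumption added for illustration and is not stated in the slides.

```python
# High-class metrics as reported in the model evaluation table.
models = {
    "Decision Tree": {"accuracy": 0.9832, "sensitivity": 0.9541,
                      "specificity": 1.0, "precision": 1.0, "roc_area": 0.9721},
    "Neural Network": {"accuracy": 0.9899, "sensitivity": 1.0,
                       "specificity": 0.9841, "precision": 0.9732, "roc_area": 0.9636},
}


def false_negative_rate(sensitivity):
    """The false negative rate is the complement of sensitivity."""
    return 1.0 - sensitivity


# Rank by sensitivity first; accuracy is only a tie-breaker (an assumption).
best_model = max(
    models,
    key=lambda name: (models[name]["sensitivity"], models[name]["accuracy"]),
)

tree_fnr = false_negative_rate(models["Decision Tree"]["sensitivity"])
ann_fnr = false_negative_rate(models["Neural Network"]["sensitivity"])
```

Ranking by ROC area or precision alone would favor the Decision Tree, so the choice depends on treating false negatives as the costliest error.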
CONCLUSION
• The project developed risk prediction models for lung cancer and identified the top
5 risk factors associated with lung cancer using classification methods from R packages.
• Using risk prediction models to select high-risk individuals for lung cancer screening
could be superior to the current selection criteria.
• Avoiding the major risk factors may help prevent lung cancer and lower its incidence.
• The results are promising for the application of lung cancer risk
prediction models to selective screening.
REFERENCES
• American Lung Association, http://www.lung.org
• National Lung Screening Trial, https://www.cancer.gov/types/lung/research/nlst
• Fitting a neural network in R, https://www.r-bloggers.com
• Classification And Regression Trees for Machine Learning, https://machinelearningmastery.com
• Machine Learning in Medicine, Rahul C. Deo, Circulation. 2015;132:1920-1930, November 16, 2015
• Evaluation of Classification Model Accuracy: Essentials, http://www.sthda.com/english/articles
• Cross-Validation for Predictive Analytics using R, http://www.milanor.net/blog/cross-validation-for-predictive-analytics-using-r/
• Ideas on interpreting machine learning, Patrick Hall, Wen Phan, SriSatish Ambati, March 15, 2017
• R packages, https://cran.r-project.org/web/packages

Lung Cancer Risk Prediction Models

  • 2. INTRODUCTION • Lung Cancer is the number one cause of all cancer deaths in the US, estimated 234,030 new cases and 154,050 deaths in 2018. • Early detection using low-dose computed tomography (CT) Screening on high risk individuals can reduce lung cancer mortality by 20%. • The current CT screening criteria are 55-77 years old adults, currently smoking, and 30 pack-year smoking history, but these simple criteria are relatively ineffective. • Many researches suggest that using lung cancer risk prediction models could lead to more effective screening programs compared to the current screening criteria.
  • 3. PROJECT PURPOSE
    • Develop two risk prediction models for lung cancer using classification algorithms in R:
      Decision Tree: Classification and Regression Tree (CART)
      Neural Network: Artificial Neural Network (ANN)
    • Select the better model based on performance metrics.
    • Identify the major risk factors associated with lung cancer.
  • 4. DATA DESCRIPTION
    • Data is a subset of the National Lung Screening Trial cohort
    • 1000 randomized participants
    • 22 attributes are potential risk factors and symptoms of lung cancer
    • Each observation has one of 3 possible classes: Low, Medium, High

    Variable                   Characteristic
    Patient ID                 Character
    Age                        Numeric 14-73
    Gender                     Binary 1-2
    Smoking                    Numeric 1-8
    Passive Smoking            Numeric 1-8
    Air Pollution              Numeric 1-8
    Occupational Hazards       Numeric 1-8
    Genetic Risk               Numeric 1-7
    Alcohol Use                Numeric 1-7
    Chronic Lung Disease       Numeric 1-7
    Dust Allergy               Numeric 1-7
    Diet Balance               Numeric 1-7
    Chest Pain                 Numeric 1-9
    Short Breath               Numeric 1-9
    Fatigue                    Numeric 1-9
    Bloody Coughing            Numeric 1-9
    Wheezing                   Numeric 1-7
    Swallowing Difficulty      Numeric 1-7
    Clubbing of finger nails   Numeric 1-7
    Weight Loss                Numeric 1-7
    Frequent Cold              Numeric 1-7
    Dry Cough                  Numeric 1-7
    Clubbing of finger nails   Numeric 1-9
    Levels                     Character / Binary: High, Medium, Low
  • 7. PERFORMANCE METRICS
    • Accuracy = (true positives + true negatives) / (positives + negatives)
    • Sensitivity (True Positive Rate) = true positives / (true positives + false negatives)
    • Specificity (True Negative Rate) = true negatives / (true negatives + false positives)
    • Precision (Positive Predictive Value) = true positives / (true positives + false positives)
    • Receiver Operating Characteristic (ROC) Area: a model's ability to discriminate between the positive and negative classes
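The definitions above apply to one class at a time: with three classes (Low, Medium, High), each class is treated in turn as "positive" and the other two as "negative". The following Python sketch illustrates that one-vs-rest calculation. It is not the project's R code; the function name `per_class_metrics` and the confusion-matrix counts are hypothetical and chosen only for illustration. ROC area is left out because it needs predicted scores, not just counts.

```python
from typing import Dict, List


def per_class_metrics(
    confusion: List[List[int]], labels: List[str]
) -> Dict[str, Dict[str, float]]:
    """One-vs-rest metrics from a square confusion matrix.

    confusion[i][j] is the number of observations whose true class is
    labels[i] and whose predicted class is labels[j].
    """
    total = sum(sum(row) for row in confusion)
    results = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]                          # true positives
        fn = sum(confusion[k]) - tp                   # missed cases of this class
        fp = sum(row[k] for row in confusion) - tp    # other classes predicted as this one
        tn = total - tp - fn - fp                     # everything else
        results[label] = {
            "accuracy": (tp + tn) / total,
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
            "precision": tp / (tp + fp) if tp + fp else float("nan"),
        }
    return results


# Hypothetical counts for illustration only (not the project's results).
labels = ["High", "Low", "Medium"]
confusion = [
    [45, 0, 5],   # true High
    [0, 30, 0],   # true Low
    [0, 4, 16],   # true Medium
]
metrics = per_class_metrics(confusion, labels)
for label in labels:
    print(label, {name: round(value, 4) for name, value in metrics[label].items()})
```

In this made-up matrix, five High cases are predicted as Medium, so the High class has a sensitivity below 1 while its specificity and precision stay at 1, the same pattern seen when a model misses some positives but raises no false alarms.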
  • 8. RESULT ANALYSIS: Decision Tree (CART)
    Class    Accuracy  Sensitivity  Specificity  Precision  ROC area
    High     .9832     .9541        1            1          .9721
    Low      .9731     1            .9615        .9184      .9342
    Medium   .9899     .9697        1            1          .9573
  • 9. RESULT ANALYSIS: Neural Network (ANN)
    Class           Accuracy  Sensitivity  Specificity  Precision  ROC area
    High (black)    .9899     1            .9841        .9732      .9636
    Low (red)       .9592     1            .8990        .8108      .8894
    Medium (green)  .9194     .7576        1            1          .9039
  • 10. MODEL EVALUATION
    Model                        Accuracy  Sensitivity  Specificity  Precision  ROC area
    Decision Tree (High level)   .9832     .9541        1            1          .9721
    Neural Network (High level)  .9899     1            .9841        .9732      .9636
  • 11. DISCUSSION
    • In medical testing, a false negative is more dangerous than a false positive, so the final risk prediction model is the Artificial Neural Network, which has 100% sensitivity (0% false negative rate) for the High class, compared with 95.41% sensitivity (4.59% false negative rate) for the Decision Tree.
    • Based on the variable importance results, the most significant risk factors for lung cancer are Air Pollution, Age, Smoking, Passive Smoking, and Alcohol Use.
    • Future improvements:
      Improve model performance by fine-tuning the model parameters.
      Reduce input features to prevent overfitting.
      Increase the amount of input data for better model performance.
      Try other classification algorithms for better model selection (Support Vector Machine, Random Forest).
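The selection rule in the discussion (prefer the model that misses the fewest true high-risk cases) can be written out in a few lines. This Python sketch uses only the High-class sensitivities reported on the model evaluation slide; the variable names are illustrative, and the rule shown (minimize false negative rate) is a simplified reading of the slide's argument, not the project's R code.

```python
# High-class sensitivity as reported on the model evaluation slide.
sensitivity = {"Decision Tree": 0.9541, "Neural Network": 1.0}

# The false negative rate is the complement of sensitivity.
false_negative_rate = {name: 1.0 - s for name, s in sensitivity.items()}

# Choose the model with the lowest false negative rate.
selected = min(false_negative_rate, key=false_negative_rate.get)

for name, fnr in false_negative_rate.items():
    print(f"{name}: false negative rate = {fnr:.2%}")
print("Selected model:", selected)
```

A rule based on sensitivity alone ignores the cost of false positives; on the same slide the Decision Tree has the higher specificity and precision, so a different weighting of the two error types could change the choice.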
  • 12. CONCLUSION
    • The project developed a risk prediction model for lung cancer and identified the top 5 risk factors associated with lung cancer using classification methods from R packages.
    • Using risk prediction models to select high-risk individuals for lung cancer screening would be superior to the current selection criteria.
    • Avoiding the major risk factors may help lower the risk of lung cancer.
    • The results are promising for the application of lung cancer risk prediction models to selective screening.