SlideShare uma empresa Scribd logo
1 de 19
Disease Prediction
System
for
Rural
Health Service
PRESENTED BY
KOYEL MAJUMDAR
REGISTRATION NO:1011621400088 OF 2019-21
ROLL NO:22332006
RINA PAUL
REGISTRATION NO:2014009633 OF 2019-21
ROLL NO:22332010
GUIDED BY
DR. KUNAL DAS
Assistant Professor, Dept. of Computer Science
Acharya Prafulla Chandra College
Motivation
Less availability of Health centre
Less availability of doctors
Distance of health centre
Recognize the disease from symptoms
Less availability of medicine
Point of care facility
Objectives
Our main aim is to provide a quick medical diagnosis
to the patients living in rural areas.
Nowadays, it is very useful for postcovid contactless
system in rural health service.
The goal is to provide access to medical specialists.
This system enhance quality of health care.
Introduction
Any technology user today has benefitted from machine learning.
Machine learning is a subfield of artificial Intelligence. The goal of
machine learning generally is to understand the structure of data and fit
that data into models that can be understood and utilized by people.
Machine learning is used across many spheres around the world. The
health care industry is not an exception.
Machine learning is used in Medical diagnosis, Stock market trading,
Image recognition, Self driving cars and more.
Such information, if predicted well in advance, can provide important
insights to doctors who can then adapt their diagnosis and treatment
per patient basis.
Related Work
Shahadat Uddin, Arif Khan, Md Ekramul Hossain and Mohammand
Ali Moni Worked on the comparing different supervised machine
learning algorithms for disease prediction. Here they selected LR,
SVM, Decision tree, Random forest, Naïve Bayes, K-nearest
neighbour and ANN.
Reference:
s12911-019-1004-8.pdf
https://link.springer.com/article/10.1186/s12911-019-1004-8
Data set features:
1. Name 2. Age 3. Sex 4. Body Mass Index 5. BP
Design of the system
Machine Learning and it’s Algorithm
Machine Learning uses programmed algorithms that learn and
optimise their operations by analysing input data to make predictions
within an acceptable range. Machine learning algorithms they can be
divided into three broad categories according to their purposes. These
three categories are: supervised, unsupervised and semi-supervised.
In supervised machine learning algorithms, a labelled training dataset
is used first to train the underlying algorithm. This trained algorithm is
then fed on the unlabelled test dataset to categorise them into similar
groups.
For disease prediction the learning model include Linear Regression,
Support Vector Machine, Decision Tree, Random Forest, Naïve Bayes,
K-nearest neighbour, Artificial Neural Network.
Algorithm Disease Name
SVM Diabetes, Breast Cancer, Heart Disease, Hypertension,
Parkinson’s Disease, Prostate Cancer
RF Breast Cancer, Diabetes, Heart Disease, Hemoglobin
variants, Lung Cancer, micro RNA
DT Breast Cancer, Cerebral infarction, Hemoglobin variants,
Heart Disease, Kidney Disease
NB Asthma, Heart Disease, Prostate Cancer
ANN Breast Cancer, Heart Disease, Stroke
KNN Heart Disease, Parkinson’s Disease
LR Heart Disease, Liver Disease
From previous referenced paper we show that this seven algorithms give the
best result for following types of predicted diseases.
Expected Outcome
•Experiments are conducted in order to evaluate the performance
of the proposed system.
•For the given symptoms using seven types of supervised machine
learning algorithm, the system predicts any kind of diseases. Then
the predicted diseases list forward to the doctor. At last doctor
choose the correct disease from forwarded list. And the doctor
suggest for medicine.
Future Planning
In future we will try to implement this system which prescribe
medicine from the past history.
We will arrange the video conferencing for talking to doctors at any
time of the day, wherever we are.
Thank You
Shortage of doctors in India
• India is ill-prepared to contain widespread Covid-19
transmission in the community because of a huge shortage of
doctors, health workers and hospital beds, especially in rural areas
and densely populated underserved states.
•For people living in rural areas completely dependent on
government hospitals and clinics, the government allopathic
doctor-patient ratio is 1:10,926, according to the National Health
Profile 2019.
•“The availability of physicians and nurses varies widely across the
country, with the central, northern, eastern, and northeastern
states being poorly served. Rural areas have an especially severe
shortage of qualified health professionals,” added Dr Srinath
Reddy, president, Public Health Foundation of India.
Appendix
Algorithm
1. Support Vector Machine: Support vector machine (SVM) algorithm
can classify both linear and non-linear data. It first maps each data
item into an n-dimensional feature space where n is the number of
features. It then identifies the hyperplane that separates the data
items into two classes while maximising the marginal distance for both
classes and minimising the classification errors.
2. Logistic Regression: Logistic regression (LR) is a powerful and well
established method for supervised classification . It can be considered
as an extension of ordinary regression and can model only a
dichotomous variable which usually represents the occurrence or
nonoccurrence of an event. LR helps in finding the probability that a
new instance belongs to a certain class. Since it is a probability, the
outcome lies between 0 and 1.
3. Decision Tree: Decision tree (DT) is one of the earliest and
prominent machine learning algorithms. A decision tree models the
decision logics i.e., tests and corresponds outcomes for classifying
data items into a tree-like structure. The nodes of a DT tree normally
have multiple levels where the first or top-most node is called the
root node. All internal nodes (i.e., nodes having at least one child)
represent tests on input variables or attributes. Depending on the test
outcome, the classification algorithm branches towards the
appropriate child node where the process of test and branching
repeats until it reaches the leaf node . The leaf or terminal nodes
correspond to the decision outcomes. DTs have been found easy to
interpret and quick to learn, and are a common component to
many medical diagnostic protocols.
4. Naïve Bayes: Naïve Bayes (NB) is a classification technique based
on the Bayes’ theorem .This theorem can describe the probability of
an event based on the prior knowledge of conditions related to that
event. This classifier assumes that a particular feature in a class is not
directly related to any other feature although features for that class
could have interdependence among themselves.
5. Random Forest: A random forest (RF) is an ensemble classifier and
consisting of many DTs similar to the way a forest is a collection of
many trees .DTs that are grown very deep often cause overfitting of
the training data, resulting a high variation in classification outcome
for a small change in the input data. They are very sensitive to their
training data, which makes them error-prone to the test dataset. The
different DTs of an RF are trained using the different parts of
the training dataset. To classify a new sample, the input vector of
that sample is required to pass down with each DT of the forest. Each
DT then considers a different part of that input vector and gives a
classification outcome.
6. Artificial Neural Network: Artificial neural networks (ANNs) are a
set of machine learning algorithms which are inspired by the
functioning of the neural networks of human brain. They were first
proposed by McCulloch and Pitts and later popularised by the works
of Rumelhart et al. in the 1980s. ANN algorithms can be represented
as an interconnected group of nodes. The output of one node goes as
input to another node for subsequent processing according to the
interconnection. Nodes are normally grouped into a matrix called
layer depending on the transformation they perform.
7. K-nearest neighbour: The K-nearest neighbour (KNN) algorithm is
one of the simplest and earliest classification algorithms. It can
be thought a simpler version of an NB classifier. Unlike the NB
technique, the KNN algorithm does not require to consider
probability values. The ‘K’ is the KNN algorithm is the number of
nearest neighbours considered to take ‘vote’ from. The selection of
different values for ‘K’ can generate different classification results for
the same sample object.
Computational Complexity of Algorithms
The train time complexity of KNN= O(knd) where k=no of
neighbors, n=no of training examples and d=no of dimensions of
data
Space complexity=O(nd)
Train time complexity of LR=O(nd)
Space complexity=O(d)
Training time complexity of SVM=O(n^2)
Run time complexity=O(k*d) where k=no of support vector
Training time complexity of DT=O(n*log(n)*d) where n=no of
points in the training set
Run time complexity=O(maximum depth of tree)
Training time complexity of RF=O(n*log(n)*d*k)
Run time complexity=O(depth of tree *k)
Space complexity=O(depth of tree *k)
Training time complexity of NB=O(n*d)
Run time complexity=O(c*d) where c=no. of classes
Complexity of ANN:
since output functions of neurons are in general very simple to
compute i would assume those costs to be constant per neuron.
Given a network with n neurons, this step would be in O(n).
Given the fact, that the number of neurons n for a given problem
can be regarded as a constant, the overall complexity of O(n^2)
equals O(1).
Reference:
https://paritoshkumar-5426.medium.com/time-complexity-of-
ml-models-4ec39fad2770

Mais conteúdo relacionado

Mais procurados

Machine Learning presentation.
Machine Learning presentation.Machine Learning presentation.
Machine Learning presentation.
butest
 
Survey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease predictionSurvey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease prediction
Sivagowry Shathesh
 
Prediction of heart disease using machine learning.pptx
Prediction of heart disease using machine learning.pptxPrediction of heart disease using machine learning.pptx
Prediction of heart disease using machine learning.pptx
kumari36
 
Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...
Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...
Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...
Universitat Politècnica de Catalunya
 

Mais procurados (20)

Heart Disease Identification Method Using Machine Learnin in E-healthcare.
Heart Disease Identification Method Using Machine Learnin in E-healthcare.Heart Disease Identification Method Using Machine Learnin in E-healthcare.
Heart Disease Identification Method Using Machine Learnin in E-healthcare.
 
Heart disease prediction system
Heart disease prediction systemHeart disease prediction system
Heart disease prediction system
 
Heart Attack Prediction using Machine Learning
Heart Attack Prediction using Machine LearningHeart Attack Prediction using Machine Learning
Heart Attack Prediction using Machine Learning
 
Breast cancer diagnosis machine learning ppt
Breast cancer diagnosis machine learning pptBreast cancer diagnosis machine learning ppt
Breast cancer diagnosis machine learning ppt
 
Machine learning ppt
Machine learning pptMachine learning ppt
Machine learning ppt
 
Support Vector Machines ( SVM )
Support Vector Machines ( SVM ) Support Vector Machines ( SVM )
Support Vector Machines ( SVM )
 
Machine Learning presentation.
Machine Learning presentation.Machine Learning presentation.
Machine Learning presentation.
 
HEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHM
HEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHMHEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHM
HEART DISEASE PREDICTION USING NAIVE BAYES ALGORITHM
 
Multiple Disease Prediction System
Multiple Disease Prediction SystemMultiple Disease Prediction System
Multiple Disease Prediction System
 
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A SurveyPrediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
 
Survey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease predictionSurvey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease prediction
 
Credit card fraud detection through machine learning
Credit card fraud detection through machine learningCredit card fraud detection through machine learning
Credit card fraud detection through machine learning
 
Prediction of heart disease using machine learning.pptx
Prediction of heart disease using machine learning.pptxPrediction of heart disease using machine learning.pptx
Prediction of heart disease using machine learning.pptx
 
Deep learning and Healthcare
Deep learning and HealthcareDeep learning and Healthcare
Deep learning and Healthcare
 
Machine Learning Using Python
Machine Learning Using PythonMachine Learning Using Python
Machine Learning Using Python
 
Brain tumor detection using image segmentation ppt
Brain tumor detection using image segmentation pptBrain tumor detection using image segmentation ppt
Brain tumor detection using image segmentation ppt
 
Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision trees
 
Diabetes prediction using machine learning
Diabetes prediction using machine learningDiabetes prediction using machine learning
Diabetes prediction using machine learning
 
Predicting Diabetes Using Machine Learning
Predicting Diabetes Using Machine LearningPredicting Diabetes Using Machine Learning
Predicting Diabetes Using Machine Learning
 
Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...
Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...
Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...
 

Semelhante a Project on disease prediction

Prediction of Neurological Disorder using Classification Approach
Prediction of Neurological Disorder using Classification ApproachPrediction of Neurological Disorder using Classification Approach
Prediction of Neurological Disorder using Classification Approach
BRNSSPublicationHubI
 
An approach for breast cancer diagnosis classification using neural network
An approach for breast cancer diagnosis classification using neural networkAn approach for breast cancer diagnosis classification using neural network
An approach for breast cancer diagnosis classification using neural network
acijjournal
 
Prognosis of Cardiac Disease using Data Mining Techniques A Comprehensive Survey
Prognosis of Cardiac Disease using Data Mining Techniques A Comprehensive SurveyPrognosis of Cardiac Disease using Data Mining Techniques A Comprehensive Survey
Prognosis of Cardiac Disease using Data Mining Techniques A Comprehensive Survey
ijtsrd
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
ijceronline
 

Semelhante a Project on disease prediction (20)

Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
 
Prediction of Neurological Disorder using Classification Approach
Prediction of Neurological Disorder using Classification ApproachPrediction of Neurological Disorder using Classification Approach
Prediction of Neurological Disorder using Classification Approach
 
IRJET- Cancer Disease Prediction using Machine Learning over Big Data
IRJET- Cancer Disease Prediction using Machine Learning over Big DataIRJET- Cancer Disease Prediction using Machine Learning over Big Data
IRJET- Cancer Disease Prediction using Machine Learning over Big Data
 
HEALTH PREDICTION ANALYSIS USING DATA MINING
HEALTH PREDICTION ANALYSIS USING DATA  MININGHEALTH PREDICTION ANALYSIS USING DATA  MINING
HEALTH PREDICTION ANALYSIS USING DATA MINING
 
An Artificial Neural Network Model for Neonatal Disease Diagnosis
An Artificial Neural Network Model for Neonatal Disease DiagnosisAn Artificial Neural Network Model for Neonatal Disease Diagnosis
An Artificial Neural Network Model for Neonatal Disease Diagnosis
 
G124549
G124549G124549
G124549
 
Health Care Application using Machine Learning and Deep Learning
Health Care Application using Machine Learning and Deep LearningHealth Care Application using Machine Learning and Deep Learning
Health Care Application using Machine Learning and Deep Learning
 
An approach for breast cancer diagnosis classification using neural network
An approach for breast cancer diagnosis classification using neural networkAn approach for breast cancer diagnosis classification using neural network
An approach for breast cancer diagnosis classification using neural network
 
Prognosis of Cardiac Disease using Data Mining Techniques A Comprehensive Survey
Prognosis of Cardiac Disease using Data Mining Techniques A Comprehensive SurveyPrognosis of Cardiac Disease using Data Mining Techniques A Comprehensive Survey
Prognosis of Cardiac Disease using Data Mining Techniques A Comprehensive Survey
 
IRJET- Disease Prediction using Machine Learning
IRJET-  Disease Prediction using Machine LearningIRJET-  Disease Prediction using Machine Learning
IRJET- Disease Prediction using Machine Learning
 
A MODIFIED MAXIMUM RELEVANCE MINIMUM REDUNDANCY FEATURE SELECTION METHOD BASE...
A MODIFIED MAXIMUM RELEVANCE MINIMUM REDUNDANCY FEATURE SELECTION METHOD BASE...A MODIFIED MAXIMUM RELEVANCE MINIMUM REDUNDANCY FEATURE SELECTION METHOD BASE...
A MODIFIED MAXIMUM RELEVANCE MINIMUM REDUNDANCY FEATURE SELECTION METHOD BASE...
 
A MODIFIED MAXIMUM RELEVANCE MINIMUM REDUNDANCY FEATURE SELECTION METHOD BASE...
A MODIFIED MAXIMUM RELEVANCE MINIMUM REDUNDANCY FEATURE SELECTION METHOD BASE...A MODIFIED MAXIMUM RELEVANCE MINIMUM REDUNDANCY FEATURE SELECTION METHOD BASE...
A MODIFIED MAXIMUM RELEVANCE MINIMUM REDUNDANCY FEATURE SELECTION METHOD BASE...
 
A MODIFIED MAXIMUM RELEVANCE MINIMUM REDUNDANCY FEATURE SELECTION METHOD BASE...
A MODIFIED MAXIMUM RELEVANCE MINIMUM REDUNDANCY FEATURE SELECTION METHOD BASE...A MODIFIED MAXIMUM RELEVANCE MINIMUM REDUNDANCY FEATURE SELECTION METHOD BASE...
A MODIFIED MAXIMUM RELEVANCE MINIMUM REDUNDANCY FEATURE SELECTION METHOD BASE...
 
A Heart Disease Prediction Model using Logistic Regression
A Heart Disease Prediction Model using Logistic RegressionA Heart Disease Prediction Model using Logistic Regression
A Heart Disease Prediction Model using Logistic Regression
 
Utilizing Machine Learning, Detect Chronic Kidney Disease and Suggest A Healt...
Utilizing Machine Learning, Detect Chronic Kidney Disease and Suggest A Healt...Utilizing Machine Learning, Detect Chronic Kidney Disease and Suggest A Healt...
Utilizing Machine Learning, Detect Chronic Kidney Disease and Suggest A Healt...
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
Hybrid CNN and LSTM Network For Heart Disease Prediction
Hybrid CNN and LSTM Network For Heart Disease PredictionHybrid CNN and LSTM Network For Heart Disease Prediction
Hybrid CNN and LSTM Network For Heart Disease Prediction
 
An ensemble deep learning classifier of entropy convolutional neural network ...
An ensemble deep learning classifier of entropy convolutional neural network ...An ensemble deep learning classifier of entropy convolutional neural network ...
An ensemble deep learning classifier of entropy convolutional neural network ...
 
IRJET- Prediction of Heart Disease using RNN Algorithm
IRJET- Prediction of Heart Disease using RNN AlgorithmIRJET- Prediction of Heart Disease using RNN Algorithm
IRJET- Prediction of Heart Disease using RNN Algorithm
 
IRJET - Survey on Analysis of Breast Cancer Prediction
IRJET - Survey on Analysis of Breast Cancer PredictionIRJET - Survey on Analysis of Breast Cancer Prediction
IRJET - Survey on Analysis of Breast Cancer Prediction
 

Último

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 

Último (20)

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

Project on disease prediction

  • 2. PRESENTED BY KOYEL MAJUMDAR REGISTRATION NO:1011621400088 OF 2019-21 ROLL NO:22332006 RINA PAUL REGISTRATION NO:2014009633 OF 2019-21 ROLL NO:22332010 GUIDED BY DR. KUNAL DAS Assistant Professor, Dept. of Computer Science Acharya Prafulla Chandra College
  • 3. Motivation Less availability of Health centre Less availability of doctors Distance of health centre Recognize the disease from symptoms Less availability of medicine Point of care facility
  • 4. Objectives Our main aim is to provide a quick medical diagnosis to the patients living in rural areas. Nowadays, it is very useful for postcovid contactless system in rural health service. The goal is to provide access to medical specialists. This system enhance quality of health care.
  • 5. Introduction Any technology user today has benefitted from machine learning. Machine learning is a subfield of artificial Intelligence. The goal of machine learning generally is to understand the structure of data and fit that data into models that can be understood and utilized by people. Machine learning is used across many spheres around the world. The health care industry is not an exception. Machine learning is used in Medical diagnosis, Stock market trading, Image recognition, Self driving cars and more. Such information, if predicted well in advance, can provide important insights to doctors who can then adapt their diagnosis and treatment per patient basis.
  • 6. Related Work Shahadat Uddin, Arif Khan, Md Ekramul Hossain and Mohammand Ali Moni Worked on the comparing different supervised machine learning algorithms for disease prediction. Here they selected LR, SVM, Decision tree, Random forest, Naïve Bayes, K-nearest neighbour and ANN. Reference: s12911-019-1004-8.pdf https://link.springer.com/article/10.1186/s12911-019-1004-8 Data set features: 1. Name 2. Age 3. Sex 4. Body Mass Index 5. BP
  • 7. Design of the system
  • 8. Machine Learning and it’s Algorithm Machine Learning uses programmed algorithms that learn and optimise their operations by analysing input data to make predictions within an acceptable range. Machine learning algorithms they can be divided into three broad categories according to their purposes. These three categories are: supervised, unsupervised and semi-supervised. In supervised machine learning algorithms, a labelled training dataset is used first to train the underlying algorithm. This trained algorithm is then fed on the unlabelled test dataset to categorise them into similar groups. For disease prediction the learning model include Linear Regression, Support Vector Machine, Decision Tree, Random Forest, Naïve Bayes, K-nearest neighbour, Artificial Neural Network.
  • 9. Algorithm Disease Name SVM Diabetes, Breast Cancer, Heart Disease, Hypertension, Parkinson’s Disease, Prostate Cancer RF Breast Cancer, Diabetes, Heart Disease, Hemoglobin variants, Lung Cancer, micro RNA DT Breast Cancer, Cerebral infarction, Hemoglobin variants, Heart Disease, Kidney Disease NB Asthma, Heart Disease, Prostate Cancer ANN Breast Cancer, Heart Disease, Stroke KNN Heart Disease, Parkinson’s Disease LR Heart Disease, Liver Disease From previous referenced paper we show that this seven algorithms give the best result for following types of predicted diseases.
  • 10. Expected Outcome •Experiments are conducted in order to evaluate the performance of the proposed system. •For the given symptoms using seven types of supervised machine learning algorithm, the system predicts any kind of diseases. Then the predicted diseases list forward to the doctor. At last doctor choose the correct disease from forwarded list. And the doctor suggest for medicine.
  • 11. Future Planning In future we will try to implement this system which prescribe medicine from the past history. We will arrange the video conferencing for talking to doctors at any time of the day, wherever we are.
  • 13. Shortage of doctors in India • India is ill-prepared to contain widespread Covid-19 transmission in the community because of a huge shortage of doctors, health workers and hospital beds, especially in rural areas and densely populated underserved states. •For people living in rural areas completely dependent on government hospitals and clinics, the government allopathic doctor-patient ratio is 1:10,926, according to the National Health Profile 2019. •“The availability of physicians and nurses varies widely across the country, with the central, northern, eastern, and northeastern states being poorly served. Rural areas have an especially severe shortage of qualified health professionals,” added Dr Srinath Reddy, president, Public Health Foundation of India.
  • 14. Appendix Algorithm 1. Support Vector Machine: Support vector machine (SVM) algorithm can classify both linear and non-linear data. It first maps each data item into an n-dimensional feature space where n is the number of features. It then identifies the hyperplane that separates the data items into two classes while maximising the marginal distance for both classes and minimising the classification errors. 2. Logistic Regression: Logistic regression (LR) is a powerful and well established method for supervised classification . It can be considered as an extension of ordinary regression and can model only a dichotomous variable which usually represents the occurrence or nonoccurrence of an event. LR helps in finding the probability that a new instance belongs to a certain class. Since it is a probability, the outcome lies between 0 and 1.
  • 15. 3. Decision Tree: Decision tree (DT) is one of the earliest and prominent machine learning algorithms. A decision tree models the decision logics i.e., tests and corresponds outcomes for classifying data items into a tree-like structure. The nodes of a DT tree normally have multiple levels where the first or top-most node is called the root node. All internal nodes (i.e., nodes having at least one child) represent tests on input variables or attributes. Depending on the test outcome, the classification algorithm branches towards the appropriate child node where the process of test and branching repeats until it reaches the leaf node . The leaf or terminal nodes correspond to the decision outcomes. DTs have been found easy to interpret and quick to learn, and are a common component to many medical diagnostic protocols.
  • 16. 4. Naïve Bayes: Naïve Bayes (NB) is a classification technique based on the Bayes’ theorem .This theorem can describe the probability of an event based on the prior knowledge of conditions related to that event. This classifier assumes that a particular feature in a class is not directly related to any other feature although features for that class could have interdependence among themselves. 5. Random Forest: A random forest (RF) is an ensemble classifier and consisting of many DTs similar to the way a forest is a collection of many trees .DTs that are grown very deep often cause overfitting of the training data, resulting a high variation in classification outcome for a small change in the input data. They are very sensitive to their training data, which makes them error-prone to the test dataset. The different DTs of an RF are trained using the different parts of the training dataset. To classify a new sample, the input vector of that sample is required to pass down with each DT of the forest. Each DT then considers a different part of that input vector and gives a classification outcome.
  • 17. 6. Artificial Neural Network: Artificial neural networks (ANNs) are a set of machine learning algorithms which are inspired by the functioning of the neural networks of human brain. They were first proposed by McCulloch and Pitts and later popularised by the works of Rumelhart et al. in the 1980s. ANN algorithms can be represented as an interconnected group of nodes. The output of one node goes as input to another node for subsequent processing according to the interconnection. Nodes are normally grouped into a matrix called layer depending on the transformation they perform. 7. K-nearest neighbour: The K-nearest neighbour (KNN) algorithm is one of the simplest and earliest classification algorithms. It can be thought a simpler version of an NB classifier. Unlike the NB technique, the KNN algorithm does not require to consider probability values. The ‘K’ is the KNN algorithm is the number of nearest neighbours considered to take ‘vote’ from. The selection of different values for ‘K’ can generate different classification results for the same sample object.
  • 18. Computational Complexity of Algorithms The train time complexity of KNN= O(knd) where k=no of neighbors, n=no of training examples and d=no of dimensions of data Space complexity=O(nd) Train time complexity of LR=O(nd) Space complexity=O(d) Training time complexity of SVM=O(n^2) Run time complexity=O(k*d) where k=no of support vector Training time complexity of DT=O(n*log(n)*d) where n=no of points in the training set Run time complexity=O(maximum depth of tree)
  • 19. Training time complexity of RF=O(n*log(n)*d*k) Run time complexity=O(depth of tree *k) Space complexity=O(depth of tree *k) Training time complexity of NB=O(n*d) Run time complexity=O(c*d) where c=no. of classes Complexity of ANN: since output functions of neurons are in general very simple to compute i would assume those costs to be constant per neuron. Given a network with n neurons, this step would be in O(n). Given the fact, that the number of neurons n for a given problem can be regarded as a constant, the overall complexity of O(n^2) equals O(1). Reference: https://paritoshkumar-5426.medium.com/time-complexity-of- ml-models-4ec39fad2770