SlideShare uma empresa Scribd logo
1 de 16
Baixar para ler offline
Sawinder Pal Kaur, PhD
Kaggle Projects
Outline
 Problem
 Statement
 Methods used
 Results
Problem: Digit Recognizer
 Identify handwritten single digits 0~9, based
on grey scale images.
Sample images
Statement
Each image is 28 pixels in height and 28 pixels in width, for a
total of 784 pixels in total. Each pixel has a single pixel-
value associated with it, indicating the lightness or darkness
of that pixel, with higher numbers meaning darker. This
pixel-value is an integer between 0 and 255, inclusive.
pixel0 pixel1 pixel2 ... pixel27
pixel28 pixel29 pixel30 ... pixel55
| | | ... |
pixel756 pixel757 pixel758 ... pixel783
Statement
 The training data set, has 785 columns. The first
column, called "label", is the digit that was drawn by the
user. The rest of the columns contain the pixel-values of
the associated image.
 The test data set, is the same as the training set, except
that it does not contain the "label" column.
 Goal of the problem is to predict the images in the test
data set
Methods used to solve the
problem
 Random Forest
 Support Vector Machine (SVM)
 K-Nearest Neighborhood (KNN)
Random Forest
 Ensemble of decision trees
 Each tree is trained on a bootstrapped sample of the
original data set
 Each time a node is split, only a randomly chosen subset
of the dimensions are considered for splitting
 Each tree is fully grown and not pruned
 When a new input is entered into the system, it is run down
all of the trees. The result may either be an average or
weighted average of all of the terminal nodes that are
reached, or, in the case of categorical variables, a voting
majority
Random Forest
Support Vector Machine
 In a SVM model original objects (training data) are treated
as a points in the space (input space)
 These are mapped (rearranged) to a new space (feature
space) using mathematical functions called kernels
 After mapping objects of separate categories are divided
by a clear gap as wide as possible
K Nearest Neighborhood
 Basic idea
 If it walks like a duck, quacks like a duck than it is probably a duck
 There are three key elements :
 a set of labeled objects (e.g., a set of stored records)
 a distance or similarity metric to compute distance between objects,
and
 the value of k, the number of nearest neighbors.
 To classify an unlabeled object :
 the distance of this object to the labeled objects is computed,
 its k-nearest neighbors are identified, and
 the class labels of these nearest neighbors are then used to
determine the class label of the object.
Results
 Random Forests with 500 trees gave 97%
accuracy on the test data.
 SVM with RBF kernel and C=1, gave 97.71%
accuracy on the test data.
 KNN with k=10 gave 96% accuracy.
Titanic: Machine Learning
from Disaster
Problem
 The sinking of the RMS Titanic is one of the most
infamous shipwrecks in history.
 One of the reasons that the shipwreck led to such loss
of life was that there were not enough lifeboats for the
passengers and crew. Although there was some
element of luck involved in surviving the sinking, some
groups of people were more likely to survive than
others, such as women, children, and the upper-class.
 In this project, the analysis of what sorts of people
were likely to survive is done. In particular, the tools of
machine learning are applied to predict which
passengers survived the tragedy.
Statement
 The historical data has been split into two
groups, a 'training set' and a 'test set'. For the
training set, the outcome whether or not the
passenger survived the sinking ( 0 for deceased,
1 for survived ) is provided.
 The goal of the problem is to predict the
outcome for each passenger in the test set.
Methods used to solve the
problem
• Random Forest
• Support Vector Machine (SVM)
Results
 Random Forests with 300 trees gave 77.9%
accuracy on the test data.
 SVM with RBF kernel and C=1, gave 77.7%
accuracy on the test data.

Mais conteúdo relacionado

Mais procurados

Customer Segmentation using Clustering
Customer Segmentation using ClusteringCustomer Segmentation using Clustering
Customer Segmentation using ClusteringDessy Amirudin
 
K-Nearest Neighbor Classifier
K-Nearest Neighbor ClassifierK-Nearest Neighbor Classifier
K-Nearest Neighbor ClassifierNeha Kulkarni
 
Lecture 8: Decision Trees & k-Nearest Neighbors
Lecture 8: Decision Trees & k-Nearest NeighborsLecture 8: Decision Trees & k-Nearest Neighbors
Lecture 8: Decision Trees & k-Nearest NeighborsMarina Santini
 
Chapter 11 cluster advanced : web and text mining
Chapter 11 cluster advanced : web and text miningChapter 11 cluster advanced : web and text mining
Chapter 11 cluster advanced : web and text miningHouw Liong The
 
Pillar k means
Pillar k meansPillar k means
Pillar k meansswathi b
 
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...
K-MEDOIDS CLUSTERING  USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...K-MEDOIDS CLUSTERING  USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...ijscmc
 
3.3 hierarchical methods
3.3 hierarchical methods3.3 hierarchical methods
3.3 hierarchical methodsKrish_ver2
 
Clustering
ClusteringClustering
ClusteringMeme Hei
 
"k-means-clustering" presentation @ Papers We Love Bucharest
"k-means-clustering" presentation @ Papers We Love Bucharest"k-means-clustering" presentation @ Papers We Love Bucharest
"k-means-clustering" presentation @ Papers We Love BucharestAdrian Florea
 
A Novel Algorithm for Design Tree Classification with PCA
A Novel Algorithm for Design Tree Classification with PCAA Novel Algorithm for Design Tree Classification with PCA
A Novel Algorithm for Design Tree Classification with PCAEditor Jacotech
 
Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clusteringArshad Farhad
 

Mais procurados (20)

Customer Segmentation using Clustering
Customer Segmentation using ClusteringCustomer Segmentation using Clustering
Customer Segmentation using Clustering
 
Clustering: A Survey
Clustering: A SurveyClustering: A Survey
Clustering: A Survey
 
K-Nearest Neighbor Classifier
K-Nearest Neighbor ClassifierK-Nearest Neighbor Classifier
K-Nearest Neighbor Classifier
 
Lecture 8: Decision Trees & k-Nearest Neighbors
Lecture 8: Decision Trees & k-Nearest NeighborsLecture 8: Decision Trees & k-Nearest Neighbors
Lecture 8: Decision Trees & k-Nearest Neighbors
 
Introduction to data mining and machine learning
Introduction to data mining and machine learningIntroduction to data mining and machine learning
Introduction to data mining and machine learning
 
KNN
KNN KNN
KNN
 
Data clustering
Data clustering Data clustering
Data clustering
 
Chapter 11 cluster advanced : web and text mining
Chapter 11 cluster advanced : web and text miningChapter 11 cluster advanced : web and text mining
Chapter 11 cluster advanced : web and text mining
 
Dbm630 lecture09
Dbm630 lecture09Dbm630 lecture09
Dbm630 lecture09
 
Pillar k means
Pillar k meansPillar k means
Pillar k means
 
Cluster Analysis for Dummies
Cluster Analysis for DummiesCluster Analysis for Dummies
Cluster Analysis for Dummies
 
K means Clustering Algorithm
K means Clustering AlgorithmK means Clustering Algorithm
K means Clustering Algorithm
 
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...
K-MEDOIDS CLUSTERING  USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...K-MEDOIDS CLUSTERING  USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...
 
3.3 hierarchical methods
3.3 hierarchical methods3.3 hierarchical methods
3.3 hierarchical methods
 
Kmeans
KmeansKmeans
Kmeans
 
Clustering
ClusteringClustering
Clustering
 
Dataa miining
Dataa miiningDataa miining
Dataa miining
 
"k-means-clustering" presentation @ Papers We Love Bucharest
"k-means-clustering" presentation @ Papers We Love Bucharest"k-means-clustering" presentation @ Papers We Love Bucharest
"k-means-clustering" presentation @ Papers We Love Bucharest
 
A Novel Algorithm for Design Tree Classification with PCA
A Novel Algorithm for Design Tree Classification with PCAA Novel Algorithm for Design Tree Classification with PCA
A Novel Algorithm for Design Tree Classification with PCA
 
Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clustering
 

Semelhante a Kaggle Projects Presentation Sawinder Pal Kaur

Machine Learning Algorithms (Part 1)
Machine Learning Algorithms (Part 1)Machine Learning Algorithms (Part 1)
Machine Learning Algorithms (Part 1)Zihui Li
 
Application of combined support vector machines in process fault diagnosis
Application of combined support vector machines in process fault diagnosisApplication of combined support vector machines in process fault diagnosis
Application of combined support vector machines in process fault diagnosisDr.Pooja Jain
 
MLHEP Lectures - day 1, basic track
MLHEP Lectures - day 1, basic trackMLHEP Lectures - day 1, basic track
MLHEP Lectures - day 1, basic trackarogozhnikov
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learningAmAn Singh
 
Introduction to conventional machine learning techniques
Introduction to conventional machine learning techniquesIntroduction to conventional machine learning techniques
Introduction to conventional machine learning techniquesXavier Rafael Palou
 
机器学习Adaboost
机器学习Adaboost机器学习Adaboost
机器学习AdaboostShocky1
 
Anomaly detection using deep one class classifier
Anomaly detection using deep one class classifierAnomaly detection using deep one class classifier
Anomaly detection using deep one class classifier홍배 김
 
Analytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningAnalytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningcsandit
 
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...cscpconf
 
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGcsandit
 
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESIMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESVikash Kumar
 
EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171Yaxin Liu
 
Classifiers
ClassifiersClassifiers
ClassifiersAyurdata
 
Introduction to Support Vector Machines
Introduction to Support Vector MachinesIntroduction to Support Vector Machines
Introduction to Support Vector MachinesSilicon Mentor
 
Machine learning for_finance
Machine learning for_financeMachine learning for_finance
Machine learning for_financeStefan Duprey
 
8.clustering algorithm.k means.em algorithm
8.clustering algorithm.k means.em algorithm8.clustering algorithm.k means.em algorithm
8.clustering algorithm.k means.em algorithmLaura Petrosanu
 
Event classification & prediction using support vector machine
Event classification & prediction using support vector machineEvent classification & prediction using support vector machine
Event classification & prediction using support vector machineRuta Kambli
 

Semelhante a Kaggle Projects Presentation Sawinder Pal Kaur (20)

Machine Learning Algorithms (Part 1)
Machine Learning Algorithms (Part 1)Machine Learning Algorithms (Part 1)
Machine Learning Algorithms (Part 1)
 
Application of combined support vector machines in process fault diagnosis
Application of combined support vector machines in process fault diagnosisApplication of combined support vector machines in process fault diagnosis
Application of combined support vector machines in process fault diagnosis
 
MLHEP Lectures - day 1, basic track
MLHEP Lectures - day 1, basic trackMLHEP Lectures - day 1, basic track
MLHEP Lectures - day 1, basic track
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learning
 
Introduction to conventional machine learning techniques
Introduction to conventional machine learning techniquesIntroduction to conventional machine learning techniques
Introduction to conventional machine learning techniques
 
机器学习Adaboost
机器学习Adaboost机器学习Adaboost
机器学习Adaboost
 
Anomaly detection using deep one class classifier
Anomaly detection using deep one class classifierAnomaly detection using deep one class classifier
Anomaly detection using deep one class classifier
 
Analytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningAnalytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion mining
 
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
 
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
 
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESIMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
 
Lect4
Lect4Lect4
Lect4
 
Text categorization
Text categorizationText categorization
Text categorization
 
EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171
 
Classifiers
ClassifiersClassifiers
Classifiers
 
Introduction to Support Vector Machines
Introduction to Support Vector MachinesIntroduction to Support Vector Machines
Introduction to Support Vector Machines
 
Neural networks
Neural networksNeural networks
Neural networks
 
Machine learning for_finance
Machine learning for_financeMachine learning for_finance
Machine learning for_finance
 
8.clustering algorithm.k means.em algorithm
8.clustering algorithm.k means.em algorithm8.clustering algorithm.k means.em algorithm
8.clustering algorithm.k means.em algorithm
 
Event classification & prediction using support vector machine
Event classification & prediction using support vector machineEvent classification & prediction using support vector machine
Event classification & prediction using support vector machine
 

Último

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 

Último (20)

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 

Kaggle Projects Presentation Sawinder Pal Kaur

  • 1. Sawinder Pal Kaur, PhD Kaggle Projects
  • 2. Outline  Problem  Statement  Methods used  Results
  • 3. Problem: Digit Recognizer  Identify handwritten single digits 0~9, based on grey scale images. Sample images
  • 4. Statement Each image is 28 pixels in height and 28 pixels in width, for a total of 784 pixels in total. Each pixel has a single pixel- value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255, inclusive. pixel0 pixel1 pixel2 ... pixel27 pixel28 pixel29 pixel30 ... pixel55 | | | ... | pixel756 pixel757 pixel758 ... pixel783
  • 5. Statement  The training data set, has 785 columns. The first column, called "label", is the digit that was drawn by the user. The rest of the columns contain the pixel-values of the associated image.  The test data set, is the same as the training set, except that it does not contain the "label" column.  Goal of the problem is to predict the images in the test data set
  • 6. Methods used to solve the problem  Random Forest  Support Vector Machine (SVM)  K-Nearest Neighborhood (KNN)
  • 7. Random Forest  Ensemble of decision trees  Each tree is trained on a bootstrapped sample of the original data set  Each time a node is split, only a randomly chosen subset of the dimensions are considered for splitting  Each tree is fully grown and not pruned  When a new input is entered into the system, it is run down all of the trees. The result may either be an average or weighted average of all of the terminal nodes that are reached, or, in the case of categorical variables, a voting majority
  • 9. Support Vector Machine  In a SVM model original objects (training data) are treated as a points in the space (input space)  These are mapped (rearranged) to a new space (feature space) using mathematical functions called kernels  After mapping objects of separate categories are divided by a clear gap as wide as possible
  • 10. K Nearest Neighborhood  Basic idea  If it walks like a duck, quacks like a duck than it is probably a duck  There are three key elements :  a set of labeled objects (e.g., a set of stored records)  a distance or similarity metric to compute distance between objects, and  the value of k, the number of nearest neighbors.  To classify an unlabeled object :  the distance of this object to the labeled objects is computed,  its k-nearest neighbors are identified, and  the class labels of these nearest neighbors are then used to determine the class label of the object.
  • 11. Results  Random Forests with 500 trees gave 97% accuracy on the test data.  SVM with RBF kernel and C=1, gave 97.71% accuracy on the test data.  KNN with k=10 gave 96% accuracy.
  • 13. Problem  The sinking of the RMS Titanic is one of the most infamous shipwrecks in history.  One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper-class.  In this project, the analysis of what sorts of people were likely to survive is done. In particular, the tools of machine learning are applied to predict which passengers survived the tragedy.
  • 14. Statement  The historical data has been split into two groups, a 'training set' and a 'test set'. For the training set, the outcome whether or not the passenger survived the sinking ( 0 for deceased, 1 for survived ) is provided.  The goal of the problem is to predict the outcome for each passenger in the test set.
  • 15. Methods used to solve the problem • Random Forest • Support Vector Machine (SVM)
  • 16. Results  Random Forests with 300 trees gave 77.9% accuracy on the test data.  SVM with RBF kernel and C=1, gave 77.7% accuracy on the test data.