www.edureka.co/data-science | Edureka’s Data Science Certification Training
k-means clustering
What Will You Learn Today?
1. Introduction to Machine Learning
2. Cluster analysis
3. Types of clustering
4. Introduction to k-means clustering
5. How k-means clustering works
6. Demo in R: Netflix use case
What is Machine Learning?
Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed.
[Figure: machine learning workflow with the labels Training Data, Learn, Algorithm, Build Model, Perform and Feedback]
ML Use Case – Google’s self-driving car
• Google’s self-driving car is a smart, driverless car.
• It collects data from its environment through sensors.
• It decides when to speed up, when to slow down, when to overtake and when to turn.
Types of Machine Learning: Supervised learning
Feed the classifier a training data set together with predefined labels.
It will learn to categorize new data under the matching label.
Example: “When and where should I buy a house?”
House features:
• Area crime rate
• Bedrooms
• Distance to HQ
• Area (in sq. ft)
• Locality
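To make the idea of learning from labeled data concrete, here is a minimal Python sketch. It is not taken from the deck: the two features, the labels and the 1-nearest-neighbour rule are all assumptions chosen for illustration.

```python
import math

# Hypothetical labeled training set: (bedrooms, distance_to_hq_km) -> label.
# Both the feature values and the labels are invented for this sketch.
training = [
    ((1, 20.0), "skip"),
    ((2, 18.0), "skip"),
    ((3, 4.0), "buy"),
    ((4, 5.0), "buy"),
]

def predict(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(training, key=lambda row: math.dist(row[0], features))
    return nearest[1]

print(predict((3, 6.0)))   # closest to the (4, 5.0) example
print(predict((1, 19.0)))  # closest to the (1, 20.0) example
```

The point of the sketch is only that the labels are supplied up front; any other classifier could stand in for the nearest-neighbour rule.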
Types of Machine Learning: Unsupervised learning
An image of fruits is first fed into the system.
The system identifies the different fruits using features such as color and size, and groups them into categories.
When a new fruit is shown, the system analyses its features and places it in the category whose items have similar features.
Cluster Analysis: Unsupervised Learning
What is Clustering?
Clustering means grouping objects based on the information found in the data that describes the objects or their relationships.
• The goal is that objects in one group should be similar to each other but different from objects in other groups.
• It deals with finding structure in a collection of unlabeled data.
Some examples of clustering methods are:
• K-means clustering
• Fuzzy C-means clustering
• Hierarchical clustering
Clustering Use Cases
Marketing
Discovering distinct groups in customer databases, such as customers who make a lot of long-distance calls.
Insurance
Identifying groups of crop insurance policy holders with a high average claim rate. Farmers crash crops when it is “profitable”.
Land use
Identification of areas of similar land use in a GIS database.
Seismic studies
Identifying probable areas for oil/gas exploration based on seismic data.
Types of Clustering
Exclusive Clustering
• An item belongs exclusively to one cluster, not several.
• K-means does this sort of exclusive clustering.
Overlapping Clustering
• An item can belong to multiple clusters.
• Its degree of association with each cluster is known.
• Fuzzy C-means does this sort of overlapping clustering.
Hierarchical Clustering
• When two clusters have a parent-child relationship or a tree-like structure, it is hierarchical clustering.
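The difference between exclusive and overlapping clustering can be sketched in a few lines of Python. This is an illustration only, not the deck's code: it assumes the centroids are already known, uses the textbook fuzzy c-means membership formula with a fuzzifier m = 2, and the function names are made up for this example.

```python
import math

def hard_assignment(point, centroids):
    """Exclusive clustering: index of the single nearest centroid."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))

def fuzzy_memberships(point, centroids, m=2.0):
    """Overlapping clustering: a degree of association with every centroid.

    Textbook fuzzy c-means membership with fuzzifier m; assumes the point
    does not coincide exactly with any centroid (that would divide by zero).
    """
    dists = [math.dist(point, c) for c in centroids]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** power for d_k in dists) for d_j in dists]

centroids = [(0.0, 0.0), (4.0, 0.0)]   # hypothetical cluster centres
point = (1.0, 0.0)

print(hard_assignment(point, centroids))    # one cluster index
print(fuzzy_memberships(point, centroids))  # one weight per cluster, summing to 1
```

With these numbers the point sits three times closer to the first centroid, so the exclusive rule puts it entirely in cluster 0, while the fuzzy rule gives it a membership of about 0.9 in cluster 0 and 0.1 in cluster 1.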
K-means clustering
k-means clustering is one of the simplest unsupervised learning algorithms for solving clustering problems.
It divides the entire dataset into k clusters.
k-means clustering requires the following two inputs:
1. k = number of clusters
2. Training set of m points = {x1, x2, x3, ..., xm}
[Figure: a total population divided into Group 1, Group 2, Group 3 and Group 4]
Example - Google News
Various news URLs related to Trump and Modi are grouped under one section.
K-means clustering automatically groups news stories about the same topic into the same cluster.
Example
“I need to find specific locations to build schools in this area so that the students don’t have to travel far.”
The plot of students in the area is given below.
[Figure: scatter plot of student locations]
Example - Solution
This looks good
“But how did he do that?”
“I’ll show you how.”
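How does k-means work?
The algorithm proceeds through six steps:
1. Choose number of clusters
2. Initialization
3. Cluster assignment
4. Move centroid
5. Optimization
6. Convergence
Step 1: Choose number of clusters
The within-cluster sum of squares (WSS) is defined as the sum of the squared distances between each member of a cluster and its centroid.
Mathematically:
$$\mathrm{WSS} = \sum_{i=1}^{m} \left\lVert p^{(i)} - q^{(i)} \right\rVert^{2}$$
where $p^{(i)}$ = data point $i$, $q^{(i)}$ = closest centroid to that data point, and $m$ = number of data points.
The idea of the elbow method is to choose the k after which the decrease in WSS levels off, so that adding more clusters brings little further improvement.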
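The WSS definition above can be checked by hand on a small data set. The Python sketch below is an illustration under stated assumptions: the eight points and the hand-picked centroids are invented, and `wss` is a hypothetical helper name, not part of the deck's R script.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def wss(points, centroids):
    """Sum, over all points, of the squared distance to the closest centroid."""
    return sum(min(sq_dist(p, q) for q in centroids) for p in points)

# Hypothetical data: two tight groups of four points each.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]

wss_k1 = wss(points, [(5.0, 5.0)])              # k = 1: centroid at the overall mean
wss_k2 = wss(points, [(1.5, 1.5), (8.5, 8.5)])  # k = 2: one centroid per group

print(wss_k1, wss_k2)
```

Going from k = 1 to k = 2 drops the WSS from 200 to 4 on this data. Further clusters could only reduce the remaining 4, which is the flattening that the elbow method looks for. An actual elbow curve would repeat this after running k-means for each candidate k.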
How does k-means work? Step 2: Initialization
Randomly initialize k points called the cluster centroids.
Here, k = 2.
The value of k (the number of clusters) can be determined from the elbow curve.
[Figure: scatter plot with X-axis and Y-axis showing the data points and the two cluster centroids]
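One common way to carry out the random initialization is to pick k of the data points themselves as starting centroids. The slide only says "randomly initialize", so that choice, the `seed` parameter and the function name are assumptions of this Python sketch.

```python
import random

def init_centroids(points, k, seed=None):
    """Pick k distinct data points at random to serve as the initial centroids."""
    rng = random.Random(seed)      # seed is optional, only for reproducibility
    return rng.sample(points, k)

# Hypothetical data points.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]

centroids = init_centroids(points, k=2, seed=0)
print(centroids)
```

Because the starting centroids are random, different runs can converge to different clusterings. Fixing the seed makes a run repeatable, and in practice the algorithm is often restarted several times with the best result kept.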
How does k-means work? Step 3: Cluster assignment
• Compute the distance between each data point and each of the initialized cluster centroids.
• Assign each data point to its nearest centroid; with k = 2 this divides the data points into two groups.
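The assignment step can be sketched as follows in Python. The points and the two starting centroids are invented, and Euclidean distance is assumed, since the slide does not name a metric.

```python
def sq_dist(p, q):
    """Squared Euclidean distance; its minimum picks the same centroid as the plain distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """For every point, return the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        for p in points
    ]

points = [(1, 1), (2, 1), (8, 9), (9, 8)]   # hypothetical data
centroids = [(0, 0), (10, 10)]              # hypothetical initial centroids, k = 2

labels = assign(points, centroids)
print(labels)  # [0, 0, 1, 1]
```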
How does k-means work? Step 4: Move centroid
• Compute the mean of the blue dots.
• Reposition the blue cluster centroid to this mean.
• Compute the mean of the orange dots.
• Reposition the orange cluster centroid to this mean.
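The centroid update amounts to taking a per-cluster mean. In the Python sketch below, label 0 plays the role of the blue dots and label 1 the orange dots. The data is invented, and keeping the old centroid when a cluster has no members is an assumption of this sketch, not something the slide specifies.

```python
def mean_point(group):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(group)
    return tuple(sum(coords) / n for coords in zip(*group))

def move_centroids(points, labels, centroids):
    """Move each centroid to the mean of the points currently assigned to it."""
    moved = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        # Assumption: an empty cluster keeps its previous centroid.
        moved.append(mean_point(members) if members else old)
    return moved

points = [(1, 1), (2, 1), (8, 9), (9, 8)]   # hypothetical data
labels = [0, 0, 1, 1]                       # 0 = "blue", 1 = "orange"

print(move_centroids(points, labels, [(0, 0), (10, 10)]))
# [(1.5, 1.0), (8.5, 8.5)]
```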
How does k-means work? Step 5: Optimization
Repeat the previous two steps (cluster assignment and move centroid) until the cluster centroids stop changing their positions.
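Putting the assignment and update steps into a loop gives the whole procedure. The Python sketch below is a minimal illustration, not the deck's R script. It takes the initial centroids as an argument so the run is deterministic, and the `max_iter` cap and the rule for empty clusters are assumptions.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(float(c) for c in centroid) for centroid in centroids]
    labels = []
    for iteration in range(1, max_iter + 1):
        # Cluster assignment: nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        # Move centroid: mean of each cluster (an empty cluster keeps its centroid).
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)
        # Convergence: stop once no centroid changed position.
        if new_centroids == centroids:
            return centroids, labels, iteration
        centroids = new_centroids
    return centroids, labels, max_iter

# Hypothetical data: two tight groups; starting centroids chosen by hand.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
final_centroids, final_labels, n_iter = kmeans(points, [(1, 1), (9, 9)])
print(final_centroids, final_labels, n_iter)
```

On this data the first pass moves the centroids to (1.5, 1.5) and (8.5, 8.5), and the second pass confirms that nothing changes, so the loop stops after two iterations.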
How does k-means work? Step 6: Convergence
• Finally, the k-means clustering algorithm converges.
• It divides the data points into two clusters, clearly visible in orange and blue.
Problem Statement
Challenge: Netflix wanted to increase its business by showing the most popular movies on its website.
Solution: Netflix decided to group the movies based on budget, gross and Facebook likes.
Approach: Netflix took an IMDb dataset of 5000 values and applied k-means clustering to group it.
“But how would I know which movie set to show and which not to?”
Demo
Solution – R Script
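Output
• We got three clusters based on budget and gross.
• Let’s see how good these clusters are.
• Printing the fitted object cl gives the following output:
Within cluster sum of squares by cluster:
(between_SS / total_SS = 72.4 %)
• The higher this percentage, the larger the share of the total variation in the data that the clusters explain. The ratio also tends to rise whenever k is increased, so it is best compared between models with the same k.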
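The between_SS / total_SS figure can be reproduced by hand: total_SS is the squared distance of every point to the overall mean, the within-cluster part is the WSS, and between_SS is the difference. The Python sketch below works through this on invented data. The function names are hypothetical, and it does not reproduce the 72.4 % of the demo, which depends on the IMDb data.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(group):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(group)
    return tuple(sum(coords) / n for coords in zip(*group))

def between_over_total(points, labels):
    """Share of the total sum of squares explained by the cluster assignment."""
    grand_mean = mean_point(points)
    total_ss = sum(sq_dist(p, grand_mean) for p in points)
    within_ss = 0.0
    for j in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == j]
        centroid = mean_point(members)
        within_ss += sum(sq_dist(p, centroid) for p in members)
    return (total_ss - within_ss) / total_ss

# Hypothetical data and cluster labels.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

ratio = between_over_total(points, labels)
print(f"between_SS / total_SS = {ratio:.1%}")
```

Here total_SS is 200 and the within-cluster part is 4, so the two clusters account for 98 % of the variation.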
Output
Further, let’s relate the cluster assignments to individual characteristics such as director Facebook likes (column 5) and movie Facebook likes (column 28). Cluster 2 has the maximum movie likes as well as director likes.
Try this out
“I want to know the profit values of the movies.”
“Hmm… I will go with cluster 2. It is making the maximum profit and has the maximum Facebook likes.”
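One way to approach the profit exercise is to average profit within each cluster. The deck does not define profit, so the Python sketch below assumes profit = gross − budget. The rows are invented for illustration and are not values from the IMDb dataset.

```python
from collections import defaultdict

# Hypothetical rows: (cluster label, budget, gross), both in millions.
movies = [
    (1, 10, 12),
    (1, 20, 18),
    (2, 100, 300),
    (2, 150, 350),
    (3, 50, 60),
]

def mean_profit_by_cluster(rows):
    """Average of (gross - budget) per cluster; this profit definition is an assumption."""
    totals = defaultdict(lambda: [0.0, 0])   # cluster -> [profit sum, movie count]
    for cluster, budget, gross in rows:
        totals[cluster][0] += gross - budget
        totals[cluster][1] += 1
    return {cluster: s / n for cluster, (s, n) in totals.items()}

profits = mean_profit_by_cluster(movies)
best = max(profits, key=profits.get)
print(profits, best)
```

With these made-up numbers cluster 2 has the highest average profit, which mirrors the conclusion on the slide, but the real answer has to come from the clustered dataset itself.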
Course Details
Go to www.edureka.co/data-science
Get Edureka Certified in Data Science Today!
What our learners have to say about us!
Shravan Reddy says- “I would like to recommend any one who
wants to be a Data Scientist just one place: Edureka. Explanations
are clean, clear, easy to understand. Their support team works
very well.. I took the Data Science course and I'm going to take
Machine Learning with Mahout and then Big Data and Hadoop”.
Gnana Sekhar says - “Edureka Data science course provided me a very
good mixture of theoretical and practical training. LMS pre recorded
sessions and assignments were very good as there is a lot of
information in them that will help me in my job. Edureka is my
teaching GURU now...Thanks EDUREKA.”
Balu Samaga says - “It was a great experience to undergo and get
certified in the Data Science course from Edureka. Quality of the
training materials, assignments, project, support and other
infrastructures are a top notch.”
www.edureka.co/data-scienceEdureka’s Data Science Certification Training

More Related Content

What's hot

Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clusteringArshad Farhad
 
Feature selection
Feature selectionFeature selection
Feature selectionDong Guo
 
K-means clustering algorithm
K-means clustering algorithmK-means clustering algorithm
K-means clustering algorithmVinit Dantkale
 
Bias and variance trade off
Bias and variance trade offBias and variance trade off
Bias and variance trade offVARUN KUMAR
 
Data Science - Part III - EDA & Model Selection
Data Science - Part III - EDA & Model SelectionData Science - Part III - EDA & Model Selection
Data Science - Part III - EDA & Model SelectionDerek Kane
 
How to choose Machine Learning algorithm.
How to choose Machine Learning  algorithm.How to choose Machine Learning  algorithm.
How to choose Machine Learning algorithm.Mala Deep Upadhaya
 
Clustering, k-means clustering
Clustering, k-means clusteringClustering, k-means clustering
Clustering, k-means clusteringMegha Sharma
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkKnoldus Inc.
 
Lecture 6: Ensemble Methods
Lecture 6: Ensemble Methods Lecture 6: Ensemble Methods
Lecture 6: Ensemble Methods Marina Santini
 
Data preprocessing in Machine learning
Data preprocessing in Machine learning Data preprocessing in Machine learning
Data preprocessing in Machine learning pyingkodi maran
 
Linear Regression Algorithm | Linear Regression in R | Data Science Training ...
Linear Regression Algorithm | Linear Regression in R | Data Science Training ...Linear Regression Algorithm | Linear Regression in R | Data Science Training ...
Linear Regression Algorithm | Linear Regression in R | Data Science Training ...Edureka!
 
Introduction to Linear Discriminant Analysis
Introduction to Linear Discriminant AnalysisIntroduction to Linear Discriminant Analysis
Introduction to Linear Discriminant AnalysisJaclyn Kokx
 
Supervised Machine Learning
Supervised Machine LearningSupervised Machine Learning
Supervised Machine LearningAnkit Rai
 
Linear regression
Linear regressionLinear regression
Linear regressionMartinHogg9
 
Linear Regression vs Logistic Regression | Edureka
Linear Regression vs Logistic Regression | EdurekaLinear Regression vs Logistic Regression | Edureka
Linear Regression vs Logistic Regression | EdurekaEdureka!
 
K-Nearest Neighbor Classifier
K-Nearest Neighbor ClassifierK-Nearest Neighbor Classifier
K-Nearest Neighbor ClassifierNeha Kulkarni
 
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...Simplilearn
 
Machine Learning-Linear regression
Machine Learning-Linear regressionMachine Learning-Linear regression
Machine Learning-Linear regressionkishanthkumaar
 

What's hot (20)

Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clustering
 
Feature selection
Feature selectionFeature selection
Feature selection
 
K-means clustering algorithm
K-means clustering algorithmK-means clustering algorithm
K-means clustering algorithm
 
Bias and variance trade off
Bias and variance trade offBias and variance trade off
Bias and variance trade off
 
Data Science - Part III - EDA & Model Selection
Data Science - Part III - EDA & Model SelectionData Science - Part III - EDA & Model Selection
Data Science - Part III - EDA & Model Selection
 
KNN
KNN KNN
KNN
 
How to choose Machine Learning algorithm.
How to choose Machine Learning  algorithm.How to choose Machine Learning  algorithm.
How to choose Machine Learning algorithm.
 
Clustering, k-means clustering
Clustering, k-means clusteringClustering, k-means clustering
Clustering, k-means clustering
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
 
Lecture 6: Ensemble Methods
Lecture 6: Ensemble Methods Lecture 6: Ensemble Methods
Lecture 6: Ensemble Methods
 
Data preprocessing in Machine learning
Data preprocessing in Machine learning Data preprocessing in Machine learning
Data preprocessing in Machine learning
 
Linear Regression Algorithm | Linear Regression in R | Data Science Training ...
Linear Regression Algorithm | Linear Regression in R | Data Science Training ...Linear Regression Algorithm | Linear Regression in R | Data Science Training ...
Linear Regression Algorithm | Linear Regression in R | Data Science Training ...
 
Introduction to Linear Discriminant Analysis
Introduction to Linear Discriminant AnalysisIntroduction to Linear Discriminant Analysis
Introduction to Linear Discriminant Analysis
 
Supervised Machine Learning
Supervised Machine LearningSupervised Machine Learning
Supervised Machine Learning
 
Linear regression
Linear regressionLinear regression
Linear regression
 
Linear Regression vs Logistic Regression | Edureka
Linear Regression vs Logistic Regression | EdurekaLinear Regression vs Logistic Regression | Edureka
Linear Regression vs Logistic Regression | Edureka
 
Support vector machine
Support vector machineSupport vector machine
Support vector machine
 
K-Nearest Neighbor Classifier
K-Nearest Neighbor ClassifierK-Nearest Neighbor Classifier
K-Nearest Neighbor Classifier
 
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
Decision Tree Algorithm With Example | Decision Tree In Machine Learning | Da...
 
Machine Learning-Linear regression
Machine Learning-Linear regressionMachine Learning-Linear regression
Machine Learning-Linear regression
 

Similar to K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm | Edureka

Application of Clustering in Data Science using Real-life Examples
Application of Clustering in Data Science using Real-life Examples Application of Clustering in Data Science using Real-life Examples
Application of Clustering in Data Science using Real-life Examples Edureka!
 
K-Means Clustering Explained_ Algorithm And Sklearn Implementation _ by Mariu...
K-Means Clustering Explained_ Algorithm And Sklearn Implementation _ by Mariu...K-Means Clustering Explained_ Algorithm And Sklearn Implementation _ by Mariu...
K-Means Clustering Explained_ Algorithm And Sklearn Implementation _ by Mariu...christopher corlett
 
Data Science Tutorial | What is Data Science? | Data Science For Beginners | ...
Data Science Tutorial | What is Data Science? | Data Science For Beginners | ...Data Science Tutorial | What is Data Science? | Data Science For Beginners | ...
Data Science Tutorial | What is Data Science? | Data Science For Beginners | ...Edureka!
 
Random Forest Tutorial | Random Forest in R | Machine Learning | Data Science...
Random Forest Tutorial | Random Forest in R | Machine Learning | Data Science...Random Forest Tutorial | Random Forest in R | Machine Learning | Data Science...
Random Forest Tutorial | Random Forest in R | Machine Learning | Data Science...Edureka!
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningHoa Le
 
Data-centric AI and the convergence of data and model engineering: opportunit...
Data-centric AI and the convergence of data and model engineering:opportunit...Data-centric AI and the convergence of data and model engineering:opportunit...
Data-centric AI and the convergence of data and model engineering: opportunit...Paolo Missier
 
Business Analytics with R
Business Analytics with RBusiness Analytics with R
Business Analytics with REdureka!
 
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...Edureka!
 
data-science-pdf-16588.pdf
data-science-pdf-16588.pdfdata-science-pdf-16588.pdf
data-science-pdf-16588.pdfvkharish18
 
Data Science For Beginners | Who Is A Data Scientist? | Data Science Tutorial...
Data Science For Beginners | Who Is A Data Scientist? | Data Science Tutorial...Data Science For Beginners | Who Is A Data Scientist? | Data Science Tutorial...
Data Science For Beginners | Who Is A Data Scientist? | Data Science Tutorial...Edureka!
 
Applying K-Means Clustering Algorithm to Discover Knowledge from Insurance Da...
Applying K-Means Clustering Algorithm to Discover Knowledge from Insurance Da...Applying K-Means Clustering Algorithm to Discover Knowledge from Insurance Da...
Applying K-Means Clustering Algorithm to Discover Knowledge from Insurance Da...theijes
 
Machine Learning Algorithms | Machine Learning Tutorial | Data Science Tutori...
Machine Learning Algorithms | Machine Learning Tutorial | Data Science Tutori...Machine Learning Algorithms | Machine Learning Tutorial | Data Science Tutori...
Machine Learning Algorithms | Machine Learning Tutorial | Data Science Tutori...Edureka!
 
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)Qazi Maaz Arshad
 
Data Science Full Course | Edureka
Data Science Full Course | EdurekaData Science Full Course | Edureka
Data Science Full Course | EdurekaEdureka!
 
Study and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using RapidminerStudy and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using RapidminerIJERA Editor
 
KNN Algorithm Using R | Edureka
KNN Algorithm Using R | EdurekaKNN Algorithm Using R | Edureka
KNN Algorithm Using R | EdurekaEdureka!
 
Business Analytics Decision Tree in R
Business Analytics Decision Tree in RBusiness Analytics Decision Tree in R
Business Analytics Decision Tree in REdureka!
 
Data Science : Make Smarter Business Decisions
Data Science : Make Smarter Business DecisionsData Science : Make Smarter Business Decisions
Data Science : Make Smarter Business DecisionsEdureka!
 
Cloudera Data Science Challenge 3 Solution by Doug Needham
Cloudera Data Science Challenge 3 Solution by Doug NeedhamCloudera Data Science Challenge 3 Solution by Doug Needham
Cloudera Data Science Challenge 3 Solution by Doug NeedhamDoug Needham
 

Similar to K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm | Edureka (20)

Application of Clustering in Data Science using Real-life Examples
Application of Clustering in Data Science using Real-life Examples Application of Clustering in Data Science using Real-life Examples
Application of Clustering in Data Science using Real-life Examples
 
K-Means Clustering Explained_ Algorithm And Sklearn Implementation _ by Mariu...
K-Means Clustering Explained_ Algorithm And Sklearn Implementation _ by Mariu...K-Means Clustering Explained_ Algorithm And Sklearn Implementation _ by Mariu...
K-Means Clustering Explained_ Algorithm And Sklearn Implementation _ by Mariu...
 
Data Science Tutorial | What is Data Science? | Data Science For Beginners | ...
Data Science Tutorial | What is Data Science? | Data Science For Beginners | ...Data Science Tutorial | What is Data Science? | Data Science For Beginners | ...
Data Science Tutorial | What is Data Science? | Data Science For Beginners | ...
 
Random Forest Tutorial | Random Forest in R | Machine Learning | Data Science...
Random Forest Tutorial | Random Forest in R | Machine Learning | Data Science...Random Forest Tutorial | Random Forest in R | Machine Learning | Data Science...
Random Forest Tutorial | Random Forest in R | Machine Learning | Data Science...
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearning
 
Data-centric AI and the convergence of data and model engineering: opportunit...
Data-centric AI and the convergence of data and model engineering:opportunit...Data-centric AI and the convergence of data and model engineering:opportunit...
Data-centric AI and the convergence of data and model engineering: opportunit...
 
Business Analytics with R
Business Analytics with RBusiness Analytics with R
Business Analytics with R
 
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
 
data-science-pdf-16588.pdf
data-science-pdf-16588.pdfdata-science-pdf-16588.pdf
data-science-pdf-16588.pdf
 
Data Science For Beginners | Who Is A Data Scientist? | Data Science Tutorial...
Data Science For Beginners | Who Is A Data Scientist? | Data Science Tutorial...Data Science For Beginners | Who Is A Data Scientist? | Data Science Tutorial...
Data Science For Beginners | Who Is A Data Scientist? | Data Science Tutorial...
 
Applying K-Means Clustering Algorithm to Discover Knowledge from Insurance Da...
Applying K-Means Clustering Algorithm to Discover Knowledge from Insurance Da...Applying K-Means Clustering Algorithm to Discover Knowledge from Insurance Da...
Applying K-Means Clustering Algorithm to Discover Knowledge from Insurance Da...
 
Machine Learning Algorithms | Machine Learning Tutorial | Data Science Tutori...
Machine Learning Algorithms | Machine Learning Tutorial | Data Science Tutori...Machine Learning Algorithms | Machine Learning Tutorial | Data Science Tutori...
Machine Learning Algorithms | Machine Learning Tutorial | Data Science Tutori...
 
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
 
Data Science Full Course | Edureka
Data Science Full Course | EdurekaData Science Full Course | Edureka
Data Science Full Course | Edureka
 
Study and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using RapidminerStudy and Analysis of K-Means Clustering Algorithm Using Rapidminer
Study and Analysis of K-Means Clustering Algorithm Using Rapidminer
 
KNN Algorithm Using R | Edureka
KNN Algorithm Using R | EdurekaKNN Algorithm Using R | Edureka
KNN Algorithm Using R | Edureka
 
Business Analytics Decision Tree in R
Business Analytics Decision Tree in RBusiness Analytics Decision Tree in R
Business Analytics Decision Tree in R
 
3 classification
3  classification3  classification
3 classification
 
Data Science : Make Smarter Business Decisions
Data Science : Make Smarter Business DecisionsData Science : Make Smarter Business Decisions
Data Science : Make Smarter Business Decisions
 
Cloudera Data Science Challenge 3 Solution by Doug Needham
Cloudera Data Science Challenge 3 Solution by Doug NeedhamCloudera Data Science Challenge 3 Solution by Doug Needham
Cloudera Data Science Challenge 3 Solution by Doug Needham
 

More from Edureka!

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaEdureka!
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaEdureka!
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaEdureka!
 
Tableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaTableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaEdureka!
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaEdureka!
 
Top 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaTop 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaEdureka!
 
Top Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaTop Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaEdureka!
 
Linux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaLinux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaEdureka!
 
How to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaHow to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaEdureka!
 
Importance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaImportance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaEdureka!
 
RPA in 2020 | Edureka
RPA in 2020 | EdurekaRPA in 2020 | Edureka
RPA in 2020 | EdurekaEdureka!
 
Email Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEmail Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEdureka!
 
EA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEdureka!
 
Cognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaCognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaEdureka!
 
AWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaAWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaEdureka!
 
Blue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaBlue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaEdureka!
 
Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Edureka!
 
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaA star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaEdureka!
 
Kubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaKubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaEdureka!
 
Introduction to DevOps | Edureka
Introduction to DevOps | EdurekaIntroduction to DevOps | Edureka
Introduction to DevOps | EdurekaEdureka!
 

More from Edureka! (20)

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | Edureka
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | Edureka
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka

K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm | Edureka

  • 1. www.edureka.co/data-science | Edureka’s Data Science Certification Training | k-means clustering
  • 2. What Will You Learn Today? (1) Introduction to Machine Learning; (2) Cluster analysis; (3) Types of clustering; (4) Introduction to k-means clustering; (5) How k-means clustering works; (6) Demo in R: Netflix use-case.
  • 3. What is Machine learning? Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. [Diagram labels: Training Data, Learn, Algorithm, Build Model, Perform, Feedback]
  • 4. ML Use Case – Google self-driving car. Google’s self-driving car is a smart, driverless car. It collects data from its environment through sensors. It takes decisions such as when to speed up, when to slow down, when to overtake and when to turn.
  • 5. Types of Machine Learning: Supervised learning. Feed the classifier a training data set with predefined labels; it learns to categorize particular data under a specific label. Example: “When and where should I buy a house?” House features: area crime rate, bedrooms, distance to HQ, area (in sq. ft), locality.
  • 6. Types of Machine Learning: Unsupervised learning. An image of fruits is first fed into the system. The system identifies the different fruits using features such as color and size, and categorizes them. When a new fruit is shown, the system analyses its features and puts it into the category of items with similar features.
  • 7. Cluster Analysis (Unsupervised Learning)
  • 8. What is Clustering? Clustering means grouping objects based on the information found in the data describing the objects or their relationships. The goal is that objects in one group should be similar to each other but different from objects in another group. It deals with finding structure in a collection of unlabeled data. Some examples of clustering methods: K-means clustering; Fuzzy/C-means clustering; Hierarchical clustering.
  • 9. Clustering Use Cases. Marketing: discovering distinct groups in customer databases, such as customers who make a lot of long-distance calls. Insurance: identifying groups of crop insurance policy holders with a high average claim rate. Farmers crash crops when it is “profitable”. Land use: identifying areas of similar land use in a GIS database. Seismic studies: identifying probable areas for oil/gas exploration based on seismic data.
  • 10. Types of clustering
  • 11. Types of Clustering. Exclusive clustering: an item belongs exclusively to one cluster, not several; k-means does this sort of exclusive clustering. Overlapping clustering: an item can belong to multiple clusters, and its degree of association with each cluster is known; Fuzzy/C-means does this sort of overlapping clustering. Hierarchical clustering: when two clusters have a parent-child relationship or a tree-like structure, it is hierarchical clustering.
  • 12. K-means clustering
  • 13. K-means clustering. k-means clustering is one of the simplest algorithms that uses an unsupervised learning method to solve known clustering problems. It divides the entire dataset into k clusters. k-means clustering requires the following two inputs: 1. k = number of clusters; 2. Training set of m points = {x1, x2, x3, ..., xm}. [Diagram: total population divided into Group 1, Group 2, Group 3 and Group 4]
  • 14. Example - Google News. Various news URLs related to Trump and Modi are grouped under one section. K-means clustering automatically groups news stories about the same topic into a pre-defined number of clusters.
  • 15. Example. “I need to find specific locations to build schools in this area so that the students don’t have to travel much.” [Figure: plot of students in the area]
  • 16. Example - Solution. “This looks good.”
  • 17. “But how did he do that?” “I’ll show you how.”
  • 18. How does k-means work?
  • 19. How does k-means work? Steps: choose number of clusters, initialization, cluster assignment, move centroid, optimization, convergence. Choose number of clusters: the within-cluster sum of squares (WSS) is defined as the sum of the squared distances between each member of a cluster and its centroid. Mathematically, $WSS = \sum_{i=1}^{m} \lVert p^{(i)} - q^{(i)} \rVert^{2}$, where $p^{(i)}$ is a data point and $q^{(i)}$ is the centroid closest to that data point. The idea of the elbow method is to choose the k after which the decrease in WSS levels off.
  • 20. How does k-means work? Initialization: randomly initialize k points, called the cluster centroids. Here, k = 2. The value of k (the number of clusters) can be determined from the elbow curve. [Plot: data points and cluster centroids; axes: X-axis, Y-axis]
  • 21. How does k-means work? Cluster assignment: compute the distance between each data point and each initialized cluster centroid. Each data point is assigned to its nearest centroid, which divides the data points into two groups.
  • 22. How does k-means work? Move centroid: compute the mean of the blue dots and reposition the blue cluster centroid to this mean. Compute the mean of the orange dots and reposition the orange cluster centroid to this mean.
  • 23. How does k-means work? Optimization: repeat the previous two steps iteratively until the cluster centroids stop changing their positions.
  • 24. How does k-means work? Convergence: finally, the k-means clustering algorithm converges and divides the data points into two clusters, clearly visible in orange and blue.
  • 25. Problem Statement. Challenge: Netflix wanted to increase its business by showing the most popular movies on its website. Solution: Netflix decided to group the movies based on budget, gross and Facebook likes. Approach: Netflix took an IMDb dataset of 5000 values and applied k-means clustering to group it. “But how would I know which movie set to show and which not to?”
  • 27. Solution – R Script
  • 28. Output. We got three clusters based on budget and gross. Let’s see how good these clusters are. Printing cl gives the following output: Within cluster sum of squares by cluster: (between_SS / total_SS = 72.4 %). The higher the percentage, the better the model.
  • 29. Output. Further, let’s relate the cluster assignment to individual characteristics such as director Facebook likes (column 5) and movie Facebook likes (column 28). Cluster 2 has the maximum movie likes as well as director likes.
  • 30. Try this out. “I want to know the profit values of the movies.”
  • 31. “Hmm… I will go with cluster 2. It is making maximum profit and has maximum Facebook likes.”
  • 32. Course Details. Go to www.edureka.co/data-science. Get Edureka Certified in Data Science Today! What our learners have to say about us! Shravan Reddy says - “I would like to recommend any one who wants to be a Data Scientist just one place: Edureka. Explanations are clean, clear, easy to understand. Their support team works very well.. I took the Data Science course and I'm going to take Machine Learning with Mahout and then Big Data and Hadoop”. Gnana Sekhar says - “Edureka Data science course provided me a very good mixture of theoretical and practical training. LMS pre recorded sessions and assignments were very good as there is a lot of information in them that will help me in my job. Edureka is my teaching GURU now...Thanks EDUREKA.” Balu Samaga says - “It was a great experience to undergo and get certified in the Data Science course from Edureka. Quality of the training materials, assignments, project, support and other infrastructures are a top notch.”
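
The steps on slides 19 to 24 (initialization, cluster assignment, move centroid, repeat until convergence, then WSS) can be sketched in a few lines. This is only an illustrative sketch, not the R script from the deck: it is written in plain Python, the function names (`squared_distance`, `mean_point`, `kmeans`), the `seed` and `max_iter` parameters and the six sample points are all made up for the example, and it assumes the initial centroids are drawn at random from the data points themselves.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means (hypothetical helper): returns (centroids, labels, wss)."""
    rng = random.Random(seed)
    # Initialization: assumption for this sketch, pick k data points at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Cluster assignment: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Convergence: stop once no assignment changes between iterations.
        if new_labels == labels:
            break
        labels = new_labels
        # Move centroid: reposition each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    # WSS: sum of squared distances from each point to its own centroid.
    wss = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, wss


# Made-up sample data: two visibly separate groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]
centroids, labels, wss = kmeans(points, k=2)
print(labels, round(wss, 3))
```

Calling `kmeans` for k = 1, 2, 3, ... and plotting the returned `wss` against k gives the elbow curve described on slide 19. Which cluster is numbered 0 and which 1 depends on the random initialization, so only the grouping itself is meaningful.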

Editor's Notes

  1. Add photos