CLUSTERING
Presented By:
SHARBANI DEY
LIPIKA SAHA
INTRODUCTION
• Clustering is an unsupervised learning method used for data abstraction.
• Clustering is the task of identifying groups of similar data points in a dataset.
• Objects are grouped on the basis of similarity and dissimilarity: objects in
the same cluster are similar to each other and dissimilar to objects in other clusters.
TYPES OF CLUSTERING
• Hard Clustering
In hard clustering, each data point either belongs to a cluster completely
or does not belong to it at all.
• Soft Clustering
In soft clustering, a data item can belong to multiple clusters at once,
typically with a degree of membership for each cluster.
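The difference between the two types can be sketched in a few lines of Python. This is only an illustration: the function names, the 1-D toy centres, and the softmax-style weighting are assumptions made for the example, not something the slides prescribe.

```python
import math

def hard_assign(point, centres):
    """Hard clustering: return the index of the single nearest centre."""
    dist = [abs(point - c) for c in centres]
    return dist.index(min(dist))

def soft_assign(point, centres):
    """Soft clustering: return one membership weight per centre (sums to 1)."""
    # Softmax over negative distances; an illustrative weighting only.
    w = [math.exp(-abs(point - c)) for c in centres]
    total = sum(w)
    return [x / total for x in w]

centres = [0.0, 10.0]                # hypothetical 1-D cluster centres
print(hard_assign(1.0, centres))     # one label for the point
print(soft_assign(5.0, centres))     # one weight per cluster for the point
```

A point halfway between the two centres still gets a single label under hard assignment, while soft assignment splits its membership evenly.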
CLUSTERING METHODS
Density-Based Methods:
These methods search the data space for regions of differing density: dense
regions of data points form clusters, separated by sparser regions.
Hierarchical-Based Methods:
In these methods, the clusters form a tree-type structure based on a hierarchy.
New clusters are formed from the previously formed ones.
They are divided into two categories:
• Agglomerative
• Divisive
Partitioning-Based Methods:
These methods partition the objects into k partitions, and each partition forms
one cluster.
Example: K-means
Grid-Based Methods:
In these methods, the data space is divided into a finite number of cells
that form a grid-like structure.
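As a rough illustration of the grid idea, the sketch below maps 2-D points to square cells. The function name `grid_cells` and the `cell_size` parameter are hypothetical; an actual grid-based method would go on to form clusters from the populated cells, which is not shown here.

```python
from collections import defaultdict

def grid_cells(points, cell_size):
    """Map each 2-D point to the grid cell that contains it."""
    cells = defaultdict(list)
    for x, y in points:
        # Integer cell coordinates obtained by flooring each axis.
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))
    return dict(cells)

# Hypothetical toy points on a grid with 10x10 cells.
print(grid_cells([(1, 1), (2, 3), (11, 12)], cell_size=10))
```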
K Means Clustering
• It is an algorithm that groups similar elements or data points into clusters.
• The number of groups or clusters is represented by k.
• It assumes that the object attributes form a vector space based on the
features provided.
K Means Clustering Algorithm
Step 1: First, we randomly initialize k points, called means.
Step 2: We assign each item to its closest mean and update that mean's
coordinates, which are the averages of the items assigned to that mean so
far.
Step 3: We repeat the process for a given number of iterations; at the end,
we have our clusters.
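The three steps can be sketched in plain Python as follows. This is a minimal sketch, not a definitive implementation: it assumes the initial means are drawn from the items themselves, uses Euclidean distance, and updates the means once per pass (a batch variant) rather than after every single item. The name `k_means` and its parameters are chosen for the example.

```python
import math
import random

def k_means(items, k, n_iters=10, seed=0):
    """Batch k-means sketch; returns (means, labels)."""
    rng = random.Random(seed)
    # Step 1: pick k distinct items at random as the initial means.
    means = [tuple(p) for p in rng.sample(items, k)]

    def closest(p):
        # Index of the mean nearest to point p (Euclidean distance).
        return min(range(k), key=lambda j: math.dist(p, means[j]))

    # Step 3: repeat for a fixed number of iterations.
    for _ in range(n_iters):
        # Step 2a: assign every item to its closest mean.
        labels = [closest(p) for p in items]
        # Step 2b: move each mean to the average of its assigned items.
        for j in range(k):
            members = [p for p, lab in zip(items, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mean
                means[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Final assignment against the final means.
    return means, [closest(p) for p in items]
```

Calling `k_means([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)` separates the two obvious groups; which group receives label 0 depends on the random initialization.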
Example of K-means Clustering
Let us consider a table
Individual Height Weight
1 185 72
2 170 56
3 168 60
4 179 68
5 182 72
Step 1: We randomly choose two centroids for the two clusters (here, individuals 1 and 2)
k1=(185,72)
k2=(170,56)
Step 2: Using these centroids, we compute the Euclidean distance (ED) from the 3rd point (168,60) to each centroid
ED=sqrt[(x0-xc)^2+(y0-yc)^2]
where (x0,y0) is the data point and (xc,yc) is the centroid
k1=sqrt[(168-185)^2+(60-72)^2]
k1=20.81
k2=sqrt[(168-170)^2+(60-56)^2]
k2=4.47
Therefore 3 belongs to k2
Step 3: Calculate the new centroid values for k2
k2=[(170+168)/2 , (56+60)/2]
k2=(169,58)
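Repeating Steps 2 and 3 for individuals 4 and 5 gives the final clusters
K1={1,4,5}
K2={2,3}
Individual k1 k2
3 20.81 4.47
4 7.21 14.14
5 2.00 19.10
Individual 4 is measured against k1=(185,72) and k2=(169,58). After it joins K1,
k1 is updated to (182,70), and individual 5 is measured against that new k1.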
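The worked example can be replayed with a short script. It is a sketch under two assumptions: individuals 3, 4 and 5 are processed in that order, and a centroid is recomputed as the average of its members right after each assignment (the "so far" update described in the algorithm). The variable names are chosen for the example.

```python
import math

# Height/weight table from the example, keyed by individual number.
data = {1: (185, 72), 2: (170, 56), 3: (168, 60), 4: (179, 68), 5: (182, 72)}

# Step 1: individuals 1 and 2 seed the two clusters.
clusters = {"k1": [1], "k2": [2]}
centroids = {"k1": data[1], "k2": data[2]}

for i in (3, 4, 5):
    # Step 2: Euclidean distance from the individual to each current centroid.
    dist = {name: math.dist(data[i], c) for name, c in centroids.items()}
    nearest = min(dist, key=dist.get)
    clusters[nearest].append(i)
    # Step 3: the centroid becomes the average of its members so far.
    members = [data[m] for m in clusters[nearest]]
    centroids[nearest] = tuple(sum(col) / len(members) for col in zip(*members))
    print(i, {name: round(d, 2) for name, d in dist.items()}, "->", nearest)

print(clusters)
```

The script ends with individuals 1, 4 and 5 in k1 and individuals 2 and 3 in k2.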
Hierarchical Clustering
• Hierarchical clustering builds successive clusters from previously
established clusters.
• It makes no assumption about the number of clusters.
Agglomerative Hierarchical Clustering
• Initially, treat every data point as an individual cluster; at every
step, merge the nearest pair of clusters.
It is a bottom-up method.
At first, every data point is considered an individual entity or cluster.
At every iteration, the closest clusters are merged, until a single
cluster is formed.
Example of Agglomerative Hierarchical Clustering
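As a small illustration of bottom-up merging, the sketch below clusters the height/weight table from the K-means example. It assumes single linkage (the distance between two clusters is the distance between their closest members); the slides do not say which linkage is intended, and the function name `agglomerative` is chosen for the example.

```python
import math

def agglomerative(points, n_clusters=1):
    """Single-linkage sketch; returns the clusters and the merge history."""
    # Start with every point (by index) in its own cluster.
    clusters = [[i] for i in range(len(points))]
    history = []

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the nearest pair of clusters.
        pairs = [(linkage(clusters[x], clusters[y]), x, y)
                 for x in range(len(clusters))
                 for y in range(x + 1, len(clusters))]
        d, x, y = min(pairs)
        history.append((sorted(clusters[x]), sorted(clusters[y]), round(d, 2)))
        # Merge them into one cluster (y > x, so popping y leaves x valid).
        clusters[x] = clusters[x] + clusters.pop(y)
    return clusters, history

# Individuals 1..5 from the K-means table, stored at indices 0..4.
points = [(185, 72), (170, 56), (168, 60), (179, 68), (182, 72)]
clusters, history = agglomerative(points, n_clusters=2)
print(history)
print(clusters)
```

Stopping at two clusters groups indices 0, 3, 4 together and 1, 2 together, which matches K1 and K2 from the K-means example. Passing `n_clusters=1` continues merging until a single cluster remains.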
Divisive Hierarchical Clustering
Divisive hierarchical clustering is the opposite of agglomerative
hierarchical clustering: it is a top-down method.
In divisive hierarchical clustering, we start with all of the data
points in a single cluster.
In every iteration, we split off the data points that are not similar to
the rest of their cluster.
In the end, we are left with N clusters (one per data point, if the splitting
is carried all the way down).
Example of Divisive Hierarchical Clustering
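A top-down split can be sketched in the same style. The splitting rule used here is an assumption made for illustration: the widest cluster is split by taking its two most distant members as seeds and sending every other member to the nearer seed. The slides do not specify a splitting rule, and the names `diameter` and `divisive` are hypothetical.

```python
import math
from itertools import combinations

def diameter(cluster, points):
    """Largest pairwise distance inside a cluster (0 for a single point)."""
    return max((math.dist(points[i], points[j])
                for i, j in combinations(cluster, 2)), default=0.0)

def divisive(points, n_clusters):
    """Top-down sketch: keep splitting the widest cluster in two."""
    clusters = [list(range(len(points)))]  # start with one all-inclusive cluster
    while len(clusters) < n_clusters:
        widest = max(clusters, key=lambda c: diameter(c, points))
        if diameter(widest, points) == 0:
            break  # nothing left that can be split further
        clusters.remove(widest)
        # Use the two most distant members as seeds for the two halves.
        a, b = max(combinations(widest, 2),
                   key=lambda p: math.dist(points[p[0]], points[p[1]]))
        left = [i for i in widest
                if math.dist(points[i], points[a]) <= math.dist(points[i], points[b])]
        right = [i for i in widest if i not in left]
        clusters += [left, right]
    return clusters

# Individuals 1..5 from the K-means table, stored at indices 0..4.
points = [(185, 72), (170, 56), (168, 60), (179, 68), (182, 72)]
print(divisive(points, n_clusters=2))
```

With `n_clusters` equal to the number of points, the splitting continues until every point is its own cluster.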
THANK YOU
