K-Means Clustering Problem
            Ahmad Sabiq
          Febri Maspiyanti
       Indah Kuntum Khairina
          Wiwin Farhania
              Yonatan
What is k-means?
• To partition n objects into k clusters, based on
  their attributes.
  – Objects in the same cluster are close together:
    their attributes are similar.
  – Objects in different clusters are far apart: their
    attributes are very dissimilar.
Algorithm
• Input: n objects, k (integer k ≤ n)
• Output: k clusters
• Steps:
   1. Select k initial centroids.
   2. Calculate the distance between each object and
      each centroid.
   3. Assign each object to the cluster with the nearest
      centroid.
   4. Recalculate each centroid.
   5. If the centroids don’t change, stop (convergence).
      Otherwise, go back to step 2.
• Complexity: O(k · n · d · total_iterations), where d is
  the number of attributes (dimensions) per object
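The five steps above can be sketched in plain Python. This is a minimal illustration, not an optimized implementation: it assumes Euclidean distance, objects given as tuples of numbers, and an initial centroid list supplied by the caller. The names `kmeans`, `nearest`, `mean`, and `max_iter` are chosen here for illustration, and keeping the old centroid for an empty cluster is an assumption of this sketch.

```python
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def kmeans(objects, centroids, max_iter=100):
    """Run steps 2-5 from the slide, starting from the given initial centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Steps 2-3: measure the distance to every centroid, assign to the nearest.
        labels = [nearest(p, centroids) for p in objects]
        # Step 4: recompute each centroid.
        # Assumption: an empty cluster keeps its previous centroid.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(objects, labels) if lab == j]
            updated.append(mean(members) if members else old)
        # Step 5: stop as soon as no centroid has moved.
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels
```

Each pass computes n · k distances over d attributes, which is where the O(k · n · d) cost per iteration comes from.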
Initialization
• Why is it important? What does it affect?
  – The clustering result: k-means can converge to a local optimum!
  – The total number of iterations, and therefore the running time
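For context, the "optimum" here usually refers to the within-cluster sum of squared distances, the quantity that standard k-means decreases at every iteration. The slides do not state the objective, so the notation below (objects x_i, clusters C_j, centroids μ_j) is an assumption:

```latex
J(C_1,\dots,C_k) \;=\; \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j \;=\; \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

The assignment and update steps never increase J, but they only reach a local minimum, and which one is reached depends on the initial centroids.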
Good Initialization
3 clusters with 2 iterations…
Bad Initialization
3 clusters with 4 iterations…
Initialization Methods
1.   Random
2.   Forgy
3.   MacQueen
4.   Kaufman
Random
• Algorithm:
  1. Assign each object to a random cluster.
  2. Compute the initial centroid of each cluster.
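A minimal sketch of the two Random steps, using only the standard library. The function name `random_partition_init`, the `seed` parameter, and the fallback for a cluster that receives no object are assumptions made for this example; the slides do not say how empty clusters are handled.

```python
import random


def random_partition_init(objects, k, seed=None):
    """Assign each object to a random cluster, then return the cluster means."""
    rng = random.Random(seed)
    # Step 1: every object gets a cluster index drawn uniformly from 0..k-1.
    labels = [rng.randrange(k) for _ in objects]
    centroids = []
    for j in range(k):
        members = [p for p, lab in zip(objects, labels) if lab == j]
        if not members:
            # Assumption: an empty cluster falls back to one random object.
            members = [rng.choice(objects)]
        # Step 2: the initial centroid is the component-wise mean of the members.
        centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
    return centroids
```

Because every centroid is an average over a random subset, the initial centroids tend to start close to the overall mean of the data.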
(Figure: Random initialization on the example data set; x-axis from 0 to 35, y-axis from 0 to 9.)
Forgy
• Algorithm:
  1. Choose k objects at random and use them as the initial
     centroids.
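The Forgy step is a single random draw without replacement. A minimal sketch, with the function name `forgy_init` and the `seed` parameter introduced here for illustration:

```python
import random


def forgy_init(objects, k, seed=None):
    """Pick k different objects (by position) uniformly at random as centroids."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no object is picked twice.
    return [tuple(p) for p in rng.sample(objects, k)]
```

Unlike the Random method, every initial centroid here coincides with an actual object of the data set.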
(Figure: Forgy initialization on the example data set; x-axis from 0 to 35, y-axis from 0 to 9.)
MacQueen
• Algorithm:
  1. Choose k objects at random and use them as the initial
     centroids.
  2. Assign each remaining object to the cluster with the
     nearest centroid.
  3. After each assignment, recalculate the centroid of the
     cluster that received the object.
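The MacQueen steps can be sketched as a single pass with a running mean per cluster. This is an illustration under assumptions the slides leave open: the remaining objects are visited in their original order, distance is Euclidean, and the names `macqueen_init`, `seed`, and `counts` are chosen for this example.

```python
import math
import random


def macqueen_init(objects, k, seed=None):
    """Seed with k random objects, then fold in the rest one at a time."""
    rng = random.Random(seed)
    # Step 1: k randomly chosen objects become the initial centroids.
    seed_idx = rng.sample(range(len(objects)), k)
    chosen = set(seed_idx)
    centroids = [tuple(float(x) for x in objects[i]) for i in seed_idx]
    counts = [1] * k  # number of objects currently in each cluster
    for i, p in enumerate(objects):
        if i in chosen:
            continue
        # Step 2: assign the object to the cluster with the nearest centroid.
        j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
        # Step 3: update that centroid right away as an incremental mean.
        counts[j] += 1
        centroids[j] = tuple(m + (x - m) / counts[j] for m, x in zip(centroids[j], p))
    return centroids
```

Since each centroid moves immediately after an assignment, the result depends on the order in which objects are visited, not only on the random seeds.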
(Figures: MacQueen initialization on the example data set; x-axis from 0 to 35, y-axis from 0 to 9.)
Kaufman
(Figures: Kaufman initialization on the example data set. Annotations recoverable from the figures:)
• One pair of objects: d = 24.33, D = 15.52, C = 0.
• C = 0 for the other objects shown at that step.
• Sums per candidate: ∑C1 = 2.74, ∑C2 = 12.21, ∑C3 = 12.36,
  ∑C4 = 8.38, ∑C5 = 52.55, ∑C6 = 55.88, ∑C7 = 53.77,
  ∑C8 = 51.16, ∑C9 = 42.69.
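The slides show the Kaufman method only through figures, so the sketch below is a reconstruction rather than the authors' own formulation. It follows the procedure as commonly described for Kaufman and Rousseeuw's method (see the references on the next slide): start from the most central object, then repeatedly add the object i with the largest sum of C = max(D − d, 0), where D is the distance from another object j to its nearest selected seed and d is the distance from j to i. Excluding the candidate itself from the sum, the Euclidean distance, and the name `kaufman_init` are assumptions of this example.

```python
import math


def kaufman_init(objects, k):
    """Greedy seed selection: each new seed maximizes the summed gain C."""
    n = len(objects)
    dist = [[math.dist(a, b) for b in objects] for a in objects]
    # First seed: the most central object (smallest total distance to the rest).
    selected = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(selected) < k:
        # D[j]: distance from object j to its nearest already selected seed.
        D = [min(dist[j][s] for s in selected) for j in range(n)]
        candidates = [i for i in range(n) if i not in selected]
        # Gain of candidate i: sum over the other candidates j of max(D_j - d_ji, 0).
        gains = {
            i: sum(max(D[j] - dist[j][i], 0.0) for j in candidates if j != i)
            for i in candidates
        }
        selected.append(max(gains, key=gains.get))
    return [tuple(objects[i]) for i in selected]
```

The method is deterministic: each step commits to the locally best seed and never revisits it. It needs all pairwise distances, so it costs more than the three random methods.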
References
1. J.M. Peña, J.A. Lozano, and P. Larrañaga. An Empirical
   Comparison of Four Initialization Methods for the
   K-Means Algorithm. Pattern Recognition Letters, vol. 20,
   pp. 1027–1040. 1999.
2. J.R. Cano, O. Cordón, F. Herrera, and L. Sánchez. A
   Greedy Randomized Adaptive Search Procedure
   Applied to the Clustering Problem as an Initialization
   Process Using K-Means as a Local Search Procedure.
   Journal of Intelligent and Fuzzy Systems, vol. 12,
   pp. 235–242. 2002.
3. L. Kaufman and P.J. Rousseeuw. Finding Groups in
   Data: An Introduction to Cluster Analysis. Wiley. 1990.
Questions
1. Why is initialization important in k-means?
2. Which initialization method has the greedy-choice
   property?
3. Explain the O(nkd) complexity of the Random
   method.
