[Slide 2 (4.1): numeric table with four rows (1 to 4) and columns labelled 10, 20, 30, 40; the cell layout is not reliably recoverable from the extraction.]
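[Figure: two points x and y in the (x1, x2) plane, separated by dx1 along the x1 axis and dx2 along the x2 axis.]

(A) Euclidean distance: $\sqrt{dx_1^2 + dx_2^2}$
(B) Manhattan distance: $|dx_1| + |dx_2|$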
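The two distances in panels (A) and (B) can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides; the function names `euclidean` and `manhattan` and the sample points are assumptions chosen for the example.

```python
import math

def euclidean(x, y):
    """Panel (A): square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Panel (B): sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

# Hypothetical points: dx1 = 3, dx2 = 4
x, y = (1.0, 2.0), (4.0, 6.0)
print(euclidean(x, y))  # 5.0
print(manhattan(x, y))  # 7.0
```

Both functions accept points of any dimension, as long as the two points have the same length.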
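$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} $$

[Figure: three scatter plots of y against x, with r ≈ 1, r ≈ 0, and r ≈ −1.]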
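The correlation coefficient r can be computed directly from its definition. The sketch below is an illustration under stated assumptions: the name `pearson_r` and the sample data are hypothetical, and the function assumes that neither input is constant (otherwise the denominator is zero).

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: product of the root sums of squared deviations
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sx * sy)  # raises ZeroDivisionError if xs or ys is constant

xs = [1, 2, 3, 4]
print(pearson_r(xs, [2, 4, 6, 8]))  # close to 1: y rises with x
print(pearson_r(xs, [8, 6, 4, 2]))  # close to -1: y falls as x rises
```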
• Top-down Clustering (Divisive Clustering)
• Bottom-up Clustering (Agglomerative Clustering)

[Figure: (A) seven points A to G in the plane; (B) a diagram with A, B, C, D, E, F, G listed along its base.]
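k-clustering
• Split the data into k groups
• Given a set S of points in d dimensions, divide S into k subsets $S_1, S_2, \ldots, S_k$ (a k-clustering) such that
    • $S = S_1 \cup S_2 \cup \cdots \cup S_k$
    • $S_i \cap S_j = \emptyset \quad (i \neq j)$
• Examples of 3-clusterings

[Figure: the points A to G grouped in two different ways. (A) one 3-clustering; (B) another 3-clustering.]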
k-clustering (2)
• (A) $v(S_i)$
• (B) $q(V)$

(A) $c_i = \frac{1}{|S_i|} \sum_{x \in S_i} x$

    $v(S_i) = \frac{1}{|S_i|} \sum_{x \in S_i} (d(x, c_i))^{n}$

(B) $\mathrm{diameter}(S_i) = \max \{ d(x_1, x_2) \mid x_1, x_2 \in S_i \}$

    $q(V) = \max \{ \mathrm{diameter}(S_i) \mid i = 1, \ldots, k \}$
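The two kinds of measure can be sketched as follows. This is an illustration, not the slides' code: the function names, the use of Euclidean distance for d, and the sample clusters are assumptions. The exponent in v(S_i) is exposed as a parameter `power`; its default of 2 (a variance-like spread) is also an assumption.

```python
import math
from itertools import combinations

def dist(x, y):
    # Assumption: d is the Euclidean distance
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def centroid(cluster):
    """c_i: coordinate-wise mean of the points in the cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def spread(cluster, power=2):
    """(A) v(S_i): average of d(x, c_i) ** power over the cluster."""
    c = centroid(cluster)
    return sum(dist(x, c) ** power for x in cluster) / len(cluster)

def diameter(cluster):
    """(B) Largest distance between any two points of the cluster."""
    return max((dist(a, b) for a, b in combinations(cluster, 2)), default=0.0)

def q(clusters):
    """(B) q(V): the largest diameter over all clusters."""
    return max(diameter(s) for s in clusters)

# Hypothetical 2-clustering
clusters = [[(0, 0), (2, 0)], [(5, 5), (5, 9), (8, 5)]]
print(centroid(clusters[0]))  # (1.0, 0.0)
print(spread(clusters[0]))    # 1.0
print(q(clusters))            # 5.0
```

A smaller value is better under either measure: (A) rewards clusters that are tight around their centroid, while (B) penalizes the single widest cluster.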
k-clustering (3)
• n, d: O(n^(O(dn)))
• k-Means
• NP-
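(Hard) K-means
• Input: the number of clusters K
• Procedure:
    1. Initialization: choose K initial means
    2. Assignment: assign each point to its nearest mean
    3. Update: recompute each mean from the points assigned to it
    4. Repeat steps 2 and 3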
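The four steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the lecture's implementation: it uses Euclidean distance, picks the initial means by sampling K data points at random, stops when the assignment no longer changes, and leaves a mean where it is if its cluster becomes empty. The function names and the seven sample points are hypothetical.

```python
import math
import random

def dist(x, y):
    # Assumption: Euclidean distance
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def mean(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    means = rng.sample(points, k)  # step 1: K data points as initial means
    assignment = None
    for _ in range(max_iter):
        # Step 2: index of the nearest mean for every point
        new_assignment = [
            min(range(k), key=lambda j: dist(means[j], x)) for x in points
        ]
        if new_assignment == assignment:  # step 4: stop once nothing changes
            break
        assignment = new_assignment
        # Step 3: move each mean to the average of its assigned points
        for j in range(k):
            members = [x for x, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous mean
                means[j] = mean(members)
    return means, assignment

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (11, 11)]
means, assignment = kmeans(points, k=2)
print(means)
print(assignment)
```

The result depends on the initial means, so different seeds can give different clusterings on less clearly separated data.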
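Example: dividing 7 points into 2 clusters with 2-means.

(0) [Figure: the 7 data points.]
(1) Choose k initial means m; with 2 clusters these are m(1) and m(2).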
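(2) Assign each point x to the nearest mean m(k):

    $\hat{k} = \arg\min_{k} \{ d(m^{(k)}, x) \}$

    For example, if $d(m^{(1)}, x) > d(m^{(2)}, x)$, then x is assigned to $m^{(2)}$. Here d(x, y) is the distance between x and y.

[Figure: data points marked + and □ together with the means m(1) and m(2).]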
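(3) Move each mean m(k) to the mean of the points assigned to it:

    $m^{(k)} = \frac{\sum_{n} r_k^{(n)} x^{(n)}}{R^{(k)}}$

    where $r_k^{(n)}$ is 1 if point $x^{(n)}$ is assigned to cluster k and 0 otherwise, and $R^{(k)} = \sum_{n} r_k^{(n)}$ is the number of points assigned to cluster k.

[Figure: data points marked + and □ together with the means m(1) and m(2).]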
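As a small worked illustration of the update step, suppose three points are currently assigned to cluster k. The points below are hypothetical and not taken from the slides: $x^{(1)} = (1, 2)$, $x^{(2)} = (3, 4)$ and $x^{(4)} = (5, 0)$, so the indicator for cluster k is 1 for n = 1, 2, 4 and 0 for every other point.

```latex
R^{(k)} = 1 + 1 + 1 = 3,
\qquad
m^{(k)} = \frac{x^{(1)} + x^{(2)} + x^{(4)}}{R^{(k)}}
        = \left( \frac{1 + 3 + 5}{3},\; \frac{2 + 4 + 0}{3} \right)
        = (3,\, 2)
```

Points assigned to other clusters contribute nothing to the sum, which is why the new mean depends only on the cluster's own members.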
(4) Repeat steps (2) and (3).

[Figure: data points marked + and □ together with the updated means m(1) and m(2).]
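K-means
• 5000

Cluster #      0     1     2     3     4     5     6     7     8     9
    0          -     -     -     -    21     -     -     3     1     7
    1          -     7    14     1     -     1     3     -     4     -
    2         21     -     1     -     -     1    19     -     1     -
    3          -     6     7     2     3     -     -    21     1    14
    4          -     -     1    24     -    21     1     -     1     -
    5          -    37     1     1    17     9     4     6    27    13
    6          -     -     -     -     8     -     -     8     1     9
    7          -     -     -    15     -     6     -     -    10     -
    8         29     -    22     2     -    12    23     -     1     -
    9          -     -     4     5     1     -     -    12     3     7

(- marks an empty cell.)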
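One way to read a cluster-by-column table like this is to look at its column totals and at the dominant column of each cluster. The sketch below copies the counts from the table and treats empty cells as 0, which is an assumption. The interpretation of the columns 0 to 9 as class labels, and the summary called `purity` here, are also assumptions made for illustration rather than something the slide states.

```python
# Rows are clusters 0..9, columns are the table's columns 0..9.
# Assumption: an empty cell in the slide's table means a count of 0.
table = [
    [0, 0, 0, 0, 21, 0, 0, 3, 1, 7],
    [0, 7, 14, 1, 0, 1, 3, 0, 4, 0],
    [21, 0, 1, 0, 0, 1, 19, 0, 1, 0],
    [0, 6, 7, 2, 3, 0, 0, 21, 1, 14],
    [0, 0, 1, 24, 0, 21, 1, 0, 1, 0],
    [0, 37, 1, 1, 17, 9, 4, 6, 27, 13],
    [0, 0, 0, 0, 8, 0, 0, 8, 1, 9],
    [0, 0, 0, 15, 0, 6, 0, 0, 10, 0],
    [29, 0, 22, 2, 0, 12, 23, 0, 1, 0],
    [0, 0, 4, 5, 1, 0, 0, 12, 3, 7],
]

# Total count in each column
col_totals = [sum(row[c] for row in table) for c in range(10)]

# For each cluster, the column holding most of its members
dominant = [max(range(10), key=lambda c: row[c]) for row in table]

# Share of all items that fall in their cluster's dominant column
total = sum(sum(row) for row in table)
purity = sum(max(row) for row in table) / total

print(col_totals)
print(dominant)
print(purity)
```

Under these assumptions every column sums to the same total, and several clusters (for example cluster 5 and cluster 8) mix sizeable counts from more than one column.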
K-means
• k
Copyright Cambridge University Press 2003. On-screen viewing permitted. Printing not permitted. http://www.cambridge
You can buy this book for 30 pounds or $50. See http://www.inference.phy.cam.ac.uk/mackay/itila/ for links.

[Figure: two scatter-plot panels, (a) and (b), with both axes running from 0 to 10, taken from p. 288 of the book credited above (chapter 20, "An Example Inf", title truncated in the source); the figure caption is truncated in the source.]
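(1)
(1) Let V be the set of points and C the set of selected centers. Initialize C to {}.
(2) Choose one point c1 ∈ V and add c1 to C.
(3) For j = 2, ..., k, repeat (a) and (b) to add a point to C:
    (a) For each point x (∈ V − C), let neighbor(x) be the point of C nearest to x.
    (b) Choose the point cj that satisfies
            $d(c_j, \mathrm{neighbor}(c_j)) = \max \{ d(x, \mathrm{neighbor}(x)) \mid x \in V - C \}$
        and add cj to C.

[Figure: panels (A) and (B), each showing the 7 points A to G.]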
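The selection procedure in steps (1) to (3) can be sketched as below. This is an illustration under stated assumptions: d is taken to be the Euclidean distance, the first center c1 is chosen by an index argument `first` (the steps leave that choice open), and k is assumed not to exceed the number of distinct points. The function name and the sample points are hypothetical.

```python
import math

def dist(x, y):
    # Assumption: d is the Euclidean distance
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def select_centers(points, k, first=0):
    """Pick k centers: start from one point, then repeatedly add the point
    whose nearest already-chosen center is farthest away."""
    centers = [points[first]]  # steps (1) and (2): C = {c1}
    for _ in range(2, k + 1):  # step (3): j = 2, ..., k
        remaining = [x for x in points if x not in centers]  # V - C

        # (a) distance from x to neighbor(x), its nearest point in C
        def gap(x):
            return min(dist(x, c) for c in centers)

        # (b) c_j maximizes d(x, neighbor(x)) over V - C
        centers.append(max(remaining, key=gap))
    return centers

# Hypothetical points
points = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 8)]
print(select_centers(points, k=3))  # [(0, 0), (11, 0), (5, 8)]
```

Once the centers are chosen, each remaining point can be placed in the cluster of its nearest center.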
[Figure: panels (C), (D) and (E), each showing the points A to G; the accompanying text is repeated from the previous slide.]
(2)
[Figure: panels (E) and (F), each showing the points A to G.]
(Self Organizing Map)
Datamining 7th Kmeans