Adaptive Methods for Data Mining and Machine Learning
              on Data Streams
(Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos)


                                   Albert Bifet
        LARCA: Laboratori d'Algorismica Relacional, Complexitat i Aprenentatge
                Departament de Llenguatges i Sistemes Informàtics
                        Universitat Politècnica de Catalunya




                          June 2009, Santander
Data Mining and Machine Learning on Data Streams
   with Concept Drift

                             Extract information from
                                 a potentially infinite sequence
                                 of data
                                 data that varies over time
                                 using few resources


                             using ADWIN
                                 ADaptive Sliding WINdow:
                                 an adaptive sliding window
                                 with no parameters

[Image: "The Disintegration of the Persistence of Memory",
 1952-54, Salvador Dalí]
Mining Massive Data


Data explosion in recent years

    100 million searches per day

    20 million transactions per day
    1,000 million credit-card transactions per month
    3,000 million telephone calls per day in the USA
    30,000 million e-mails per day, 1,000 million SMS
    IP network traffic: 1,000 million packets per hour per
    router
Mining Massive Data

              Massive Data
2007
   Digital Universe: 281 exabytes (one exabyte = a billion gigabytes)
   For the first time, the amount of information created
   exceeded the available storage

Green Computing
   The study and practice of using computing resources
   efficiently.

Algorithmic Efficiency
   One of the main ways of doing Green Computing
Mining Massive Data

Koichi Kawana
Simplicity means the achievement of maximum effect with
minimum means.


                 Donald Knuth
                 "... we should make use of the idea of
                 limited resources in our own education.
                 We can all benefit by doing occasional
                 'toy' programs, when artificial
                 restrictions are set up, so that we are
                 forced to push our abilities to the limit."
Introduction: Data Streams
Data Streams
    Potentially infinite sequence
    Large amount of data: sublinear space
    High arrival rate: sublinear time per example
    Once an element of a data stream has been processed,
    it is discarded or archived

Puzzle: Finding missing numbers
    Let π be a permutation of {1, . . . , n}.
    Let π−1 be the permutation π with one
    element missing.
    π−1 [i] arrives in increasing order

Task: Determine the missing number
    Naive approach: use an n-bit vector to record all the
    numbers seen (O(n) space)
    Data stream approach: O(log(n)) space, storing
    n(n + 1)/2 − ∑j≤i π−1 [j]
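As a quick illustration of the O(log n) approach, here is a minimal Python sketch (my own example, not from the slides) that keeps only a running sum of the elements seen so far and recovers the missing number at the end; the trick works whether or not the elements arrive in increasing order.

```python
def find_missing(stream, n):
    """Return the missing number from a stream containing
    {1, ..., n} with exactly one element removed.

    Only a running sum is stored, i.e. O(log n) bits,
    instead of an n-bit presence vector."""
    expected = n * (n + 1) // 2   # sum of 1..n
    seen = 0
    for x in stream:
        seen += x
    return expected - seen


# Example: the permutation 1..10 with 7 missing.
print(find_missing([1, 2, 3, 4, 5, 6, 8, 9, 10], 10))  # -> 7
```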
Introduction: Data Streams
Problem
                12, 35, 21, 42, 5, 43, 57, 2, 45, 67
Given n unsorted numbers, find a number that lies in the
upper half of the sorted list.

              2, 5, 12, 21, 35     42, 43, 45, 57, 67

Algorithm
Pick k numbers at random. Return the largest one.

Analysis
    The probability that the answer is wrong is the probability
    that all k numbers fall in the lower half: (1/2)^k
    To get failure probability δ we use k = log(1/δ) samples
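A small Python sketch of this randomized idea (my own illustration, not part of the slides): draw k = ⌈log2(1/δ)⌉ uniform samples and return their maximum.

```python
import math
import random

def upper_half_element(numbers, delta=0.01):
    """Return an element that lies in the upper half of the sorted
    list with probability at least 1 - delta.

    Uses k = ceil(log2(1/delta)) uniform samples; the answer is wrong
    only if every sample falls in the lower half, i.e. with
    probability (1/2)**k <= delta."""
    k = max(1, math.ceil(math.log2(1.0 / delta)))
    samples = [random.choice(numbers) for _ in range(k)]
    return max(samples)


data = [12, 35, 21, 42, 5, 43, 57, 2, 45, 67]
print(upper_half_element(data, delta=0.01))
```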
Outline



1   Introduction


2   ADWIN : Concept Drift Mining


3   Hoeffding Adaptive Tree


4   Conclusions




Data Streams


Data Streams
At any time t in the data stream, we would like the per-item
processing time and storage to be simultaneously
O(log^k (N, t)), i.e. polylogarithmic in N and t.

Approximation algorithms
    Small error rate with high probability
    An algorithm (ε, δ)−approximates F if it outputs F̃ for which
    Pr[|F̃ − F | > εF ] < δ .
Data Streams Approximation Algorithms
Frequency moments
Frequency moments of a stream A = {a1 , . . . , aN }:

                              Fk = ∑i=1..v fi^k

where fi is the frequency of i in the sequence, and k ≥ 0
    F0 : number of distinct elements in the sequence
    F1 : length of the sequence
    F2 : self-join size, the repeat rate, or Gini's index of
    homogeneity
Sketches can approximate F0 , F1 , F2 in O(log v + log N) space.

     Noga Alon, Yossi Matias, and Mario Szegedy.
     The space complexity of approximating
     the frequency moments. 1996
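To make the idea of sketching a frequency moment concrete, here is a small Python sketch in the spirit of the Alon–Matias–Szegedy estimator for F2 (my own hedged illustration, not code from the talk; the hash-based ±1 signs and the plain averaging are simplifications of the original construction).

```python
import random
from collections import Counter

def ams_f2_estimate(stream, num_counters=64, seed=0):
    """Estimate F2 = sum_i f_i^2 of a stream (AMS-style sketch).

    Each counter j keeps Z_j = sum_i f_i * s_j(i), where s_j(i) is a
    pseudo-random +/-1 sign; E[Z_j^2] = F2, so the mean of the squared
    counters estimates F2 using O(num_counters) words of space instead
    of one counter per distinct item."""
    rng = random.Random(seed)
    salts = [rng.getrandbits(64) for _ in range(num_counters)]
    z = [0] * num_counters
    for x in stream:
        for j, salt in enumerate(salts):
            sign = 1 if hash((salt, x)) & 1 else -1
            z[j] += sign
    return sum(v * v for v in z) / num_counters


stream = [1, 2, 1, 3, 2, 1, 4, 1, 2, 5] * 50
exact_f2 = sum(c * c for c in Counter(stream).values())
print(exact_f2, ams_f2_estimate(stream))
```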
Data Streams Approximation Algorithms


           10110001111 0101011
           (the window slides over the stream one element at a time)

Sliding Window
We can maintain simple statistics over sliding windows, using
O((1/ε) log² N) space, where
    N is the length of the sliding window
    ε is the accuracy parameter

    M. Datar, A. Gionis, P. Indyk, and R. Motwani.
    Maintaining stream statistics over sliding windows. 2002
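The structure Datar et al. use is an exponential histogram. Below is a simplified sketch in that spirit (my own illustration, assuming we only count the 1s in the last N bits and allow at most two buckets per size); it is not the paper's exact algorithm.

```python
class OnesCounter:
    """Approximate count of 1s in the last N bits of a 0/1 stream,
    in the spirit of the exponential-histogram idea.

    Buckets are (timestamp_of_most_recent_1, size) pairs with sizes
    that are powers of two; at most two buckets of each size are kept,
    so only polylogarithmic space is used."""

    def __init__(self, window_size):
        self.n = window_size
        self.t = 0
        self.buckets = []          # newest first

    def add(self, bit):
        self.t += 1
        # Expire the oldest bucket once it has left the window.
        if self.buckets and self.buckets[-1][0] <= self.t - self.n:
            self.buckets.pop()
        if bit == 1:
            self.buckets.insert(0, (self.t, 1))
            self._merge()

    def _merge(self):
        # Whenever three buckets share a size, merge the two oldest of them.
        i = 0
        while i + 2 < len(self.buckets):
            if self.buckets[i][1] == self.buckets[i + 1][1] == self.buckets[i + 2][1]:
                ts_newer, size = self.buckets[i + 1]
                self.buckets[i + 1] = (ts_newer, 2 * size)
                del self.buckets[i + 2]
            else:
                i += 1

    def estimate(self):
        if not self.buckets:
            return 0
        total = sum(size for _, size in self.buckets)
        return total - self.buckets[-1][1] // 2   # half of the oldest bucket


c = OnesCounter(window_size=100)
for bit in [1, 0, 1, 1, 0, 1] * 50:
    c.add(bit)
print(c.estimate())
```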
Outline



1   Introduction


2   ADWIN : Concept Drift Mining


3   Hoeffding Adaptive Tree


4   Conclusions




Data Mining Algorithms with Concept Drift

No Concept Drift                  Concept Drift

 input → DM Algorithm → output     input → DM Algorithm → output
         Counter5                          Static Model
         Counter4                              ↑
         Counter3                          Change Detector
         Counter2
         Counter1                  input → DM Algorithm → output
                                           Estimator5
                                           Estimator4
                                           Estimator3
                                           Estimator2
                                           Estimator1
Time Change Detectors and Predictors: A
          General Framework

   xt → Estimator → Estimation (output)
        Estimator → Change Detector → Alarm (output)
        A Memory module exchanges information with both the
        Estimator and the Change Detector.
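A minimal Python sketch of this framework (my own illustration, using an exponentially weighted moving average as the Estimator and a naive threshold test as the Change Detector; neither choice comes from the slides):

```python
class EWMAEstimator:
    """Estimator: exponentially weighted moving average of the input."""
    def __init__(self, alpha=0.05):
        self.alpha = alpha
        self.estimate = None

    def update(self, x):
        if self.estimate is None:
            self.estimate = x
        else:
            self.estimate = (1 - self.alpha) * self.estimate + self.alpha * x
        return self.estimate


class ThresholdChangeDetector:
    """Change Detector: raise an alarm when the input deviates too much
    from the current estimate."""
    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def detect(self, x, estimate):
        return abs(x - estimate) > self.threshold


estimator = EWMAEstimator(alpha=0.1)
detector = ThresholdChangeDetector(threshold=0.4)
stream = [0.1] * 100 + [0.9] * 100     # the mean changes halfway through
for t, xt in enumerate(stream):
    est = estimator.update(xt)
    if detector.detect(xt, est):
        print("alarm at t =", t)
        break
```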
Window Management Models

                      W = 101010110111111

Equal & fixed-size subwindows        Total window against subwindow
      1010 1011011 1111                   10101011011 1111
[Kifer+ 04]                          [Gama+ 04]

Equal-size adjacent subwindows       ADWIN: all adjacent subwindows
      1010101 1011 1111                   1 01010110111111
[Dasu+ 06]                                10 1010110111111
                                          101 010110111111
                                          ...
                                          10101011011111 1
Algorithm ADWIN
Example
W = 101010110111111

ADWIN checks every split of the window W = W0 · W1 :
  W0 = 1            W1 = 01010110111111
  W0 = 10           W1 = 1010110111111
  W0 = 101          W1 = 010110111111
  ...
  W0 = 101010110    W1 = 111111    |µ̂W0 − µ̂W1 | ≥ εc : CHANGE DETECTED!

When a change is detected, elements are dropped from the tail of W :
  W = 01010110111111



ADWIN: A DAPTIVE W INDOWING A LGORITHM
1 Initialize Window W
2 for each t > 0
3       do W ← W ∪ {xt } (i.e., add xt to the head of W )
4           repeat Drop elements from the tail of W
5             until |µ̂W0 − µ̂W1 | < εc holds
6               for every split of W into W = W0 · W1
7           Output µ̂W
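The following Python sketch is my own simplified rendering of the pseudocode above (the real ADWIN keeps the window in compressed buckets; here it is stored explicitly). It checks every split of the window and drops elements from the tail while some split exceeds a Hoeffding-style cut threshold εc; the particular threshold formula used below is one common form and is an assumption of this sketch.

```python
import math

class SimpleADWIN:
    """Didactic, unoptimized sketch of the ADWIN idea: keep a window of
    recent values; whenever some split W = W0 . W1 has sub-window means
    that differ by more than a cut threshold, drop elements from the
    tail (the old part) of the window."""

    def __init__(self, delta=0.01):
        self.delta = delta
        self.window = []           # index 0 = oldest (tail), last = head

    def _cut_threshold(self, n0, n1):
        # One common form: eps_cut = sqrt((1/2m) ln(4/delta')), with m the
        # harmonic mean of n0, n1 and delta' = delta / |W|.
        n = n0 + n1
        m = 1.0 / (1.0 / n0 + 1.0 / n1)
        delta_prime = self.delta / n
        return math.sqrt((1.0 / (2.0 * m)) * math.log(4.0 / delta_prime))

    def add(self, x):
        """Add one element; return True if a change was detected."""
        self.window.append(x)
        change = False
        shrinking = True
        while shrinking and len(self.window) >= 2:
            shrinking = False
            total = sum(self.window)
            s0 = 0.0
            for i in range(1, len(self.window)):     # split after element i-1
                s0 += self.window[i - 1]
                n0, n1 = i, len(self.window) - i
                mu0, mu1 = s0 / n0, (total - s0) / n1
                if abs(mu0 - mu1) >= self._cut_threshold(n0, n1):
                    # Some split flags a change: drop the oldest element
                    # and re-check the (now shorter) window.
                    self.window.pop(0)
                    change = True
                    shrinking = True
                    break
        return change


adwin = SimpleADWIN(delta=0.002)
stream = [0.0] * 300 + [1.0] * 300     # abrupt change of mean
for t, xt in enumerate(stream):
    if adwin.add(xt):
        print("change detected around t =", t, "window length:", len(adwin.window))
        break
```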
Algorithm ADWIN [BG07]



ADWIN has rigorous guarantees (theorems)
    On ratio of false positives
    On ratio of false negatives
    On the relation of the size of the current window and change
    rates

Other methods in the literature: [Gama+ 04], [Widmer+ 96],
[Last 02] don’t provide rigorous guarantees.




Algorithm ADWIN [BG07]

Theorem
At every time step we have:
 1   (Few false positives guarantee) If µt remains constant within W ,
     the probability that ADWIN shrinks the window at this step is at
     most δ .
 2   (Few false negatives guarantee) If for some partition of W into two
     parts W0 W1 (where W1 contains the most recent items) we have
     |µW0 − µW1 | > ε, and if

          ε ≥ 4 · √( (3 max{µW0 , µW1 } / min{n0 , n1 }) · ln(4n/δ) )

     then with probability 1 − δ ADWIN shrinks W to W1 , or shorter.
Outline



1   Introduction


2   ADWIN : Concept Drift Mining


3   Hoeffding Adaptive Tree


4   Conclusions




Classification
                    Example
Data set that describes e-mail features for deciding if it is spam:

     Contains    Domain    Has        Time
     "Money"     type      attach.    received    spam
       yes       com        yes        night      yes
       yes       edu        no         night      yes
       no        com        yes        night      yes
       no        edu        no         day        no
       no        com        no         day        no
       yes       cat        no         day        yes


    Assume we have to classify the following new instance:
     Contains    Domain    Has        Time
     "Money"     type      attach.    received    spam
       yes       edu        yes        day         ?
Decision Trees




Basic induction strategy:
    A ← the “best” decision attribute for next node
    Assign A as decision attribute for node
    For each value of A, create new descendant of node
    Sort training examples to leaf nodes
    If training examples perfectly classified, Then STOP, Else iterate
    over new leaf nodes
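To make the "best attribute" step concrete, here is a small Python sketch (my own illustration, not from the slides) that scores attributes by information gain on the spam example above.

```python
import math
from collections import Counter

# The e-mail example from the slides: (money, domain, attach, time) -> spam
DATA = [
    ("yes", "com", "yes", "night", "yes"),
    ("yes", "edu", "no",  "night", "yes"),
    ("no",  "com", "yes", "night", "yes"),
    ("no",  "edu", "no",  "day",   "no"),
    ("no",  "com", "no",  "day",   "no"),
    ("yes", "cat", "no",  "day",   "yes"),
]
ATTRIBUTES = ["money", "domain", "attach", "time"]

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, attr_index):
    labels = [r[-1] for r in rows]
    base = entropy(labels)
    remainder = 0.0
    for value in set(r[attr_index] for r in rows):
        subset = [r[-1] for r in rows if r[attr_index] == value]
        remainder += (len(subset) / len(rows)) * entropy(subset)
    return base - remainder

# The attribute with the highest gain becomes the next split.
gains = {name: information_gain(DATA, i) for i, name in enumerate(ATTRIBUTES)}
print(gains)
print("best attribute:", max(gains, key=gains.get))
```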
Hoeffding Tree / CVFDT

Hoeffding Tree : VFDT

    Pedro Domingos and Geoff Hulten.
    Mining high-speed data streams. 2000

   With high probability, it constructs a model identical to the one a
   traditional (greedy) batch method would learn
   With theoretical guarantees on the error rate
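The key quantity behind these guarantees is the Hoeffding bound: after n observations of a random variable with range R, the true mean is within ε = sqrt(R² ln(1/δ) / (2n)) of the observed mean with probability 1 − δ. A short sketch (my own illustration, with hypothetical parameter values) of how a Hoeffding tree uses it to decide whether enough examples have been seen to split:

```python
import math

def hoeffding_bound(value_range, delta, n):
    """With probability 1 - delta, the true mean of a random variable
    with range `value_range` is within epsilon of the mean observed
    over n examples."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

def should_split(gain_best, gain_second_best, delta=1e-7, n=300, value_range=1.0):
    """Split when the observed gain advantage of the best attribute over
    the runner-up exceeds the Hoeffding bound, so the choice matches the
    batch learner's with probability 1 - delta."""
    epsilon = hoeffding_bound(value_range, delta, n)
    return (gain_best - gain_second_best) > epsilon

print(hoeffding_bound(1.0, 1e-7, 300))          # epsilon after 300 examples
print(should_split(0.45, 0.25, n=300))          # enough evidence to split?
```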
VFDT / CVFDT

Concept-adapting Very Fast Decision Trees: CVFDT

    G. Hulten, L. Spencer, and P. Domingos.
    Mining time-changing data streams. 2001

   It keeps its model consistent with a sliding window of examples
   It constructs "alternative branches" as preparation for changes
   If an alternative branch becomes more accurate, the tree switches
   to it
Decision Trees: CVFDT




No theoretical guarantees on the error rate of CVFDT
CVFDT parameters :
 1   W : the example window size.
 2   T0 : number of examples used to check at each node whether the
     splitting attribute is still the best.
 3   T1 : number of examples used to build the alternate tree.
 4   T2 : number of examples used to test the accuracy of the
     alternate tree.
Decision Trees: Hoeffding Adaptive Tree


Hoeffding Adaptive Tree:
     replace the frequency-statistics counters by estimators
         no window of examples is needed, because the required
         statistics are maintained by the estimators

     change how the substitution of alternate subtrees is decided,
     using a change detector with theoretical guarantees

Summary:
 1   Theoretical guarantees
 2   No parameters
What is MOA?

{M}assive {O}nline {A}nalysis is a framework for online learning
from data streams.




    It is closely related to WEKA
    It includes a collection of offline and online methods, as well as
    tools for evaluation:
         boosting and bagging
         Hoeffding Trees
    with and without Naïve Bayes classifiers at the leaves.
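Stream learners in frameworks like MOA are typically evaluated prequentially: each example is first used to test the current model and then to train it. A generic Python sketch of that loop (my own illustration with placeholder interfaces, not MOA's actual API):

```python
def prequential_evaluation(stream, model):
    """Test-then-train loop: every example is first used to evaluate the
    current model, then to update it, so accuracy is measured online.

    `model` is assumed to expose predict(x) and learn(x, y); `stream`
    yields (x, y) pairs. Both are placeholders, not MOA classes."""
    correct = 0
    total = 0
    for x, y in stream:
        if model.predict(x) == y:     # test first ...
            correct += 1
        total += 1
        model.learn(x, y)             # ... then train
    return correct / total if total else 0.0


class MajorityClass:
    """A trivial baseline learner with the assumed interface."""
    def __init__(self):
        from collections import Counter
        self.counts = Counter()
    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None
    def learn(self, x, y):
        self.counts[y] += 1


stream = [({"f": i % 3}, "spam" if i % 4 else "ham") for i in range(1000)]
print(prequential_evaluation(stream, MajorityClass()))
```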
Ensemble Methods
      http://www.cs.waikato.ac.nz/∼abifet/MOA/




New ensemble methods:
   ADWIN bagging: When a change is detected, the worst classifier
   is removed and a new classifier is added.
   Adaptive-Size Hoeffding Tree bagging

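A sketch of the ADWIN bagging idea in Python (my own hedged illustration of the description above, with placeholder base learners; the Poisson(1) weighting of online bagging is omitted for brevity): each ensemble member's error is monitored by a change detector, and when a change is flagged the worst-performing member is replaced by a fresh one.

```python
import copy

class AdwinBagging:
    """Sketch of ADWIN bagging: each member's error stream is monitored
    by a detector; on change, drop the worst member and add a new one.

    `base_learner` must expose predict/learn; `detector_factory` must
    return objects with an add(x) -> bool method (for example, the
    SimpleADWIN sketch shown earlier).  Both are assumptions of this
    sketch, not MOA classes."""

    def __init__(self, base_learner, detector_factory, ensemble_size=10):
        self.prototype = base_learner
        self.members = [copy.deepcopy(base_learner) for _ in range(ensemble_size)]
        self.detector_factory = detector_factory
        self.detectors = [detector_factory() for _ in range(ensemble_size)]
        self.error_rates = [0.0] * ensemble_size
        self.counts = [0] * ensemble_size

    def predict(self, x):
        votes = {}
        for m in self.members:
            y = m.predict(x)
            votes[y] = votes.get(y, 0) + 1
        return max(votes, key=votes.get)

    def learn(self, x, y):
        change = False
        for i, (m, d) in enumerate(zip(self.members, self.detectors)):
            err = 0.0 if m.predict(x) == y else 1.0      # prequential error
            self.counts[i] += 1
            self.error_rates[i] += (err - self.error_rates[i]) / self.counts[i]
            if d.add(err):                               # change in this member's error
                change = True
            m.learn(x, y)
        if change:
            worst = max(range(len(self.members)), key=lambda i: self.error_rates[i])
            self.members[worst] = copy.deepcopy(self.prototype)   # replace worst member
            self.detectors[worst] = self.detector_factory()
            self.error_rates[worst] = 0.0
            self.counts[worst] = 0
```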
Outline



1   Introduction


2   ADWIN : Concept Drift Mining


3   Hoeffding Adaptive Tree


4   Conclusions




Conclusions

Adaptive and parameter-free methods based on
     replacing frequency-statistics counters by ADWIN
         no window of examples is needed, because the required
         statistics are maintained by ADWIN instances

     using ADWIN as a change detector with theoretical guarantees

Summary:
 1   Theoretical guarantees
 2   No parameters needed
 3   Higher accuracy
 4   Less space needed

Mais conteúdo relacionado

Destaque

Métodos predictivos y Descriptivos - MINERÍA DE DATOS
Métodos predictivos y Descriptivos - MINERÍA DE DATOSMétodos predictivos y Descriptivos - MINERÍA DE DATOS
Métodos predictivos y Descriptivos - MINERÍA DE DATOSlalopg
 
Mineria de Datos
Mineria de DatosMineria de Datos
Mineria de Datos04071977
 
Puerto Duro en recuperación
Puerto Duro en recuperaciónPuerto Duro en recuperación
Puerto Duro en recuperaciónJorge A Castro K
 
Aprendizaje Organizacional 2.2
Aprendizaje Organizacional 2.2Aprendizaje Organizacional 2.2
Aprendizaje Organizacional 2.2guest7e09b59
 
Liste des distributeurs iso hemp
Liste des distributeurs iso hempListe des distributeurs iso hemp
Liste des distributeurs iso hempGuillaume Boulanger
 
Acheter Nike Shox R3 Moins Cher QS1392
Acheter Nike Shox R3 Moins Cher QS1392Acheter Nike Shox R3 Moins Cher QS1392
Acheter Nike Shox R3 Moins Cher QS1392subsequentfurvo52
 
Fnsip retour sur 30 ans d’existence !
Fnsip   retour sur 30 ans d’existence !Fnsip   retour sur 30 ans d’existence !
Fnsip retour sur 30 ans d’existence !Réseau Pro Santé
 
Un pasteur met des mots sur les maux de l'époque !
Un pasteur met des mots sur les maux de l'époque !Un pasteur met des mots sur les maux de l'époque !
Un pasteur met des mots sur les maux de l'époque !Georges Schell
 
Introduction à Microsoft Dynamics CRM
Introduction à Microsoft Dynamics CRMIntroduction à Microsoft Dynamics CRM
Introduction à Microsoft Dynamics CRMSandrine Zecler
 
Presentación - Somos Más - Junio 2013
Presentación - Somos Más - Junio 2013Presentación - Somos Más - Junio 2013
Presentación - Somos Más - Junio 2013Somos Más
 
Maitrisez votre dette_technique
Maitrisez votre dette_techniqueMaitrisez votre dette_technique
Maitrisez votre dette_techniqueNicolas JOZWIAK
 
Café Numérique - Les technologies au service du commerce - WSLLabs
Café Numérique - Les technologies au service du commerce - WSLLabsCafé Numérique - Les technologies au service du commerce - WSLLabs
Café Numérique - Les technologies au service du commerce - WSLLabsSam Piroton
 
Organizaciones, participación, y nuevos medios
Organizaciones, participación, y nuevos mediosOrganizaciones, participación, y nuevos medios
Organizaciones, participación, y nuevos mediosSomos Más
 

Destaque (20)

Métodos predictivos y Descriptivos - MINERÍA DE DATOS
Métodos predictivos y Descriptivos - MINERÍA DE DATOSMétodos predictivos y Descriptivos - MINERÍA DE DATOS
Métodos predictivos y Descriptivos - MINERÍA DE DATOS
 
Mineria De Datos
Mineria De DatosMineria De Datos
Mineria De Datos
 
Mineria de Datos
Mineria de DatosMineria de Datos
Mineria de Datos
 
mineria de datos
mineria de datosmineria de datos
mineria de datos
 
Voeux 2014
Voeux 2014Voeux 2014
Voeux 2014
 
Puerto Duro en recuperación
Puerto Duro en recuperaciónPuerto Duro en recuperación
Puerto Duro en recuperación
 
Aprendizaje Organizacional 2.2
Aprendizaje Organizacional 2.2Aprendizaje Organizacional 2.2
Aprendizaje Organizacional 2.2
 
Liste des distributeurs iso hemp
Liste des distributeurs iso hempListe des distributeurs iso hemp
Liste des distributeurs iso hemp
 
Acheter Nike Shox R3 Moins Cher QS1392
Acheter Nike Shox R3 Moins Cher QS1392Acheter Nike Shox R3 Moins Cher QS1392
Acheter Nike Shox R3 Moins Cher QS1392
 
Amigotes1
Amigotes1Amigotes1
Amigotes1
 
Fnsip retour sur 30 ans d’existence !
Fnsip   retour sur 30 ans d’existence !Fnsip   retour sur 30 ans d’existence !
Fnsip retour sur 30 ans d’existence !
 
Un pasteur met des mots sur les maux de l'époque !
Un pasteur met des mots sur les maux de l'époque !Un pasteur met des mots sur les maux de l'époque !
Un pasteur met des mots sur les maux de l'époque !
 
Enelcaminoaprendi
EnelcaminoaprendiEnelcaminoaprendi
Enelcaminoaprendi
 
Introduction à Microsoft Dynamics CRM
Introduction à Microsoft Dynamics CRMIntroduction à Microsoft Dynamics CRM
Introduction à Microsoft Dynamics CRM
 
5 claves del exito avg
5 claves del exito avg5 claves del exito avg
5 claves del exito avg
 
El geniodelalampara
El geniodelalamparaEl geniodelalampara
El geniodelalampara
 
Presentación - Somos Más - Junio 2013
Presentación - Somos Más - Junio 2013Presentación - Somos Más - Junio 2013
Presentación - Somos Más - Junio 2013
 
Maitrisez votre dette_technique
Maitrisez votre dette_techniqueMaitrisez votre dette_technique
Maitrisez votre dette_technique
 
Café Numérique - Les technologies au service du commerce - WSLLabs
Café Numérique - Les technologies au service du commerce - WSLLabsCafé Numérique - Les technologies au service du commerce - WSLLabs
Café Numérique - Les technologies au service du commerce - WSLLabs
 
Organizaciones, participación, y nuevos medios
Organizaciones, participación, y nuevos mediosOrganizaciones, participación, y nuevos medios
Organizaciones, participación, y nuevos medios
 

Semelhante a Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos.

Mining Adaptively Frequent Closed Unlabeled Rooted Trees in Data Streams
Mining Adaptively Frequent Closed Unlabeled Rooted Trees in Data StreamsMining Adaptively Frequent Closed Unlabeled Rooted Trees in Data Streams
Mining Adaptively Frequent Closed Unlabeled Rooted Trees in Data StreamsAlbert Bifet
 
A Short Course in Data Stream Mining
A Short Course in Data Stream MiningA Short Course in Data Stream Mining
A Short Course in Data Stream MiningAlbert Bifet
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data ScienceAlbert Bifet
 
Less Is More: Novel Approaches to MySQL Compression for Modern Data Sets - Pe...
Less Is More: Novel Approaches to MySQL Compression for Modern Data Sets - Pe...Less Is More: Novel Approaches to MySQL Compression for Modern Data Sets - Pe...
Less Is More: Novel Approaches to MySQL Compression for Modern Data Sets - Pe...Ernie Souhrada
 
Statistics and Data Mining
Statistics and  Data MiningStatistics and  Data Mining
Statistics and Data MiningR A Akerkar
 
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...Universitat Politècnica de Catalunya
 
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)eXascale Infolab
 
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...Universitat Politècnica de Catalunya
 
Linking data without common identifiers
Linking data without common identifiersLinking data without common identifiers
Linking data without common identifiersLars Marius Garshol
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache CassandraEric Evans
 
Open Science Data Cloud (June 21, 2010)
Open Science Data Cloud (June 21, 2010)Open Science Data Cloud (June 21, 2010)
Open Science Data Cloud (June 21, 2010)Robert Grossman
 
TIM: Large-scale Energy Forecasting in Julia
TIM: Large-scale Energy Forecasting in JuliaTIM: Large-scale Energy Forecasting in Julia
TIM: Large-scale Energy Forecasting in JuliaGapData Institute
 
H2O Distributed Deep Learning by Arno Candel 071614
H2O Distributed Deep Learning by Arno Candel 071614H2O Distributed Deep Learning by Arno Candel 071614
H2O Distributed Deep Learning by Arno Candel 071614Sri Ambati
 
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache CassandraEric Evans
 
Processing biggish data on commodity hardware: simple Python patterns
Processing biggish data on commodity hardware: simple Python patternsProcessing biggish data on commodity hardware: simple Python patterns
Processing biggish data on commodity hardware: simple Python patternsGael Varoquaux
 
Publishing consuming Linked Sensor Data meetup Cuenca
Publishing consuming Linked Sensor Data meetup CuencaPublishing consuming Linked Sensor Data meetup Cuenca
Publishing consuming Linked Sensor Data meetup CuencaJean-Paul Calbimonte
 
Large data with Scikit-learn - Boston Data Mining Meetup - Alex Perrier
Large data with Scikit-learn - Boston Data Mining Meetup  - Alex PerrierLarge data with Scikit-learn - Boston Data Mining Meetup  - Alex Perrier
Large data with Scikit-learn - Boston Data Mining Meetup - Alex PerrierAlexis Perrier
 

Semelhante a Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos. (20)

Mining Adaptively Frequent Closed Unlabeled Rooted Trees in Data Streams
Mining Adaptively Frequent Closed Unlabeled Rooted Trees in Data StreamsMining Adaptively Frequent Closed Unlabeled Rooted Trees in Data Streams
Mining Adaptively Frequent Closed Unlabeled Rooted Trees in Data Streams
 
A Short Course in Data Stream Mining
A Short Course in Data Stream MiningA Short Course in Data Stream Mining
A Short Course in Data Stream Mining
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data Science
 
18 Data Streams
18 Data Streams18 Data Streams
18 Data Streams
 
Less Is More: Novel Approaches to MySQL Compression for Modern Data Sets - Pe...
Less Is More: Novel Approaches to MySQL Compression for Modern Data Sets - Pe...Less Is More: Novel Approaches to MySQL Compression for Modern Data Sets - Pe...
Less Is More: Novel Approaches to MySQL Compression for Modern Data Sets - Pe...
 
Statistics and Data Mining
Statistics and  Data MiningStatistics and  Data Mining
Statistics and Data Mining
 
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...
Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelo...
 
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
Internet Infrastructures for Big Data (Verisign's Distinguished Speaker Series)
 
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
 
Linking data without common identifiers
Linking data without common identifiersLinking data without common identifiers
Linking data without common identifiers
 
PyData Paris 2015 - Closing keynote Francesc Alted
PyData Paris 2015 - Closing keynote Francesc AltedPyData Paris 2015 - Closing keynote Francesc Alted
PyData Paris 2015 - Closing keynote Francesc Alted
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra
 
Open Science Data Cloud (June 21, 2010)
Open Science Data Cloud (June 21, 2010)Open Science Data Cloud (June 21, 2010)
Open Science Data Cloud (June 21, 2010)
 
TIM: Large-scale Energy Forecasting in Julia
TIM: Large-scale Energy Forecasting in JuliaTIM: Large-scale Energy Forecasting in Julia
TIM: Large-scale Energy Forecasting in Julia
 
H2O Distributed Deep Learning by Arno Candel 071614
H2O Distributed Deep Learning by Arno Candel 071614H2O Distributed Deep Learning by Arno Candel 071614
H2O Distributed Deep Learning by Arno Candel 071614
 
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Time Series Data with Apache Cassandra
Time Series Data with Apache CassandraTime Series Data with Apache Cassandra
Time Series Data with Apache Cassandra
 
Processing biggish data on commodity hardware: simple Python patterns
Processing biggish data on commodity hardware: simple Python patternsProcessing biggish data on commodity hardware: simple Python patterns
Processing biggish data on commodity hardware: simple Python patterns
 
Publishing consuming Linked Sensor Data meetup Cuenca
Publishing consuming Linked Sensor Data meetup CuencaPublishing consuming Linked Sensor Data meetup Cuenca
Publishing consuming Linked Sensor Data meetup Cuenca
 
Large data with Scikit-learn - Boston Data Mining Meetup - Alex Perrier
Large data with Scikit-learn - Boston Data Mining Meetup  - Alex PerrierLarge data with Scikit-learn - Boston Data Mining Meetup  - Alex Perrier
Large data with Scikit-learn - Boston Data Mining Meetup - Alex Perrier
 

Mais de Albert Bifet

Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream miningAlbert Bifet
 
MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 Albert Bifet
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAAlbert Bifet
 
Efficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersEfficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersAlbert Bifet
 
Apache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache FlinkApache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache FlinkAlbert Bifet
 
Introduction to Big Data Science
Introduction to Big Data ScienceIntroduction to Big Data Science
Introduction to Big Data ScienceAlbert Bifet
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataAlbert Bifet
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data ManagementAlbert Bifet
 
Real-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsReal-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsAlbert Bifet
 
Multi-label Classification with Meta-labels
Multi-label Classification with Meta-labelsMulti-label Classification with Meta-labels
Multi-label Classification with Meta-labelsAlbert Bifet
 
Pitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid themPitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid themAlbert Bifet
 
STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.Albert Bifet
 
Efficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive WindowsEfficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive WindowsAlbert Bifet
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real TimeAlbert Bifet
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real TimeAlbert Bifet
 
Mining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsMining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsAlbert Bifet
 
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and SolutionsPAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and SolutionsAlbert Bifet
 
Sentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming DataSentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming DataAlbert Bifet
 
Leveraging Bagging for Evolving Data Streams

Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos.

  • 1. Métodos Adaptativos de Minería de Datos y Aprendizaje para Flujos de Datos (Adaptive Data Mining and Learning Methods for Data Streams). Albert Bifet. LARCA: Laboratori d’Algorismica Relacional, Complexitat i Aprenentatge, Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya. June 2009, Santander.
  • 2–3. Data Mining and Learning from Data Streams with Concept Drift. Extract information from a potentially infinite sequence of data that varies over time, using few resources, by means of ADWIN (ADaptive Sliding WINdow): an adaptive sliding window with no parameters. [Illustration: "The Disintegration of the Persistence of Memory", 1952–54, Salvador Dalí.]
  • 4. Massive Data Mining. Data explosion in recent years: 100 million searches per day; 20 million transactions per day; 1,000 million credit-card transactions per month; 3,000 million phone calls per day in the USA; 30,000 million e-mails and 1,000 million SMS per day; IP network traffic of 1,000 million packets per hour per router.
  • 5–7. Massive Data Mining. Massive data, 2007: the digital universe reached 281 exabytes (a thousand million gigabytes), and the amount of information created exceeded the available storage for the first time. Green Computing: the study and practice of using computing resources efficiently. Algorithmic efficiency: one of the main ways of doing Green Computing.
  • 8–9. Massive Data Mining. Koichi Kawana: simplicity means achieving the maximum effect with the minimum means. Donald Knuth: “... we should make use of the idea of limited resources in our own education. We can all benefit by doing occasional 'toy' programs, when artificial restrictions are set up, so that we are forced to push our abilities to the limit.”
  • 10–14. Introduction: Data Streams. A data stream is a potentially infinite sequence, with a large volume of data (so sublinear space) and a high arrival rate (so sublinear time per item); once an element of the stream has been processed, it is discarded or archived. Puzzle: find the missing number. Let π be a permutation of {1, . . . , n} and let π−1 be π with one element missing; the values π−1[i] arrive in increasing order, and the task is to determine the missing number. A naive solution uses an n-bit vector to remember all the numbers seen, i.e. O(n) space; the data-stream solution uses O(log n) space by storing n(n + 1)/2 − ∑_{j≤i} π−1[j] (a minimal sketch follows below).
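A minimal sketch of the O(log n)-space solution to the missing-number puzzle above (the function name missing_number and the example values are mine, not from the slides): the only state kept is a running sum, which is subtracted from n(n + 1)/2 when the stream ends.

    def missing_number(stream, n):
        # The stream delivers the n - 1 present values of a permutation of 1..n.
        # Keeping only their running sum needs O(log n) bits of memory.
        running_sum = 0
        for value in stream:
            running_sum += value
        return n * (n + 1) // 2 - running_sum

    # Example: the permutation of {1, ..., 5} arrives with the value 4 missing.
    print(missing_number(iter([1, 2, 3, 5]), 5))   # prints 4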
  • 15–17. Introduction: Data Streams. Problem: given n unordered numbers, e.g. 12, 35, 21, 42, 5, 43, 57, 2, 45, 67, find a number that lies in the upper half of the sorted list (here 2, 5, 12, 21, 35 | 42, 43, 45, 57, 67). Algorithm: pick k numbers at random and return the largest. Analysis: the answer is wrong only if all k sampled numbers fall in the lower half, which happens with probability (1/2)^k, so to obtain error probability δ we use k = log(1/δ) samples (see the sketch below).
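A sketch of the sampling algorithm on this slide (the function name and the delta parameter are my choices): draw k = ⌈log2(1/δ)⌉ elements uniformly at random and return the largest; the answer is wrong only when every draw lands in the lower half, which has probability (1/2)^k ≤ δ.

    import math
    import random

    def upper_half_element(numbers, delta=0.01):
        # k = log2(1/delta) random picks; fails only if all of them
        # fall in the lower half of the sorted list.
        k = max(1, math.ceil(math.log2(1.0 / delta)))
        return max(random.choice(numbers) for _ in range(k))

    print(upper_half_element([12, 35, 21, 42, 5, 43, 57, 2, 45, 67], delta=0.01))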
  • 18. Outline: 1. Introduction; 2. ADWIN: Concept Drift Mining; 3. Hoeffding Adaptive Tree; 4. Conclusions.
  • 19. Data Streams. At any time t in the data stream, we would like the per-item processing time and storage to be simultaneously O(log^k(N, t)). Approximation algorithms: a small error rate with high probability. An algorithm (ε, δ)-approximates F if it outputs F̃ for which Pr[|F̃ − F| > εF] < δ.
  • 20. Data Streams: Approximation Algorithms. Frequency moments of a stream A = {a1, . . . , aN}: F_k = ∑_{i=1}^{v} f_i^k, where f_i is the frequency of item i in the sequence and k ≥ 0. F0 is the number of distinct elements in the sequence, F1 is the length of the sequence, and F2 is the self-join size, also called the repeat rate or Gini's index of homogeneity. Sketches can approximate F0, F1, F2 in O(log v + log N) space (a brute-force reference computation follows below). Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. 1996.
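For reference, a brute-force computation of these frequency moments (this is the exact, linear-space version; the point of the Alon–Matias–Szegedy result cited above is that F0, F1 and F2 can be approximated by sketches in logarithmic space):

    from collections import Counter

    def frequency_moment(stream, k):
        # F_k = sum over the distinct items i of f_i^k,
        # where f_i is the frequency of item i in the sequence.
        return sum(f ** k for f in Counter(stream).values())

    a = [1, 2, 1, 3, 2, 1]
    print(frequency_moment(a, 0))   # F0 = 3, distinct elements
    print(frequency_moment(a, 1))   # F1 = 6, length of the sequence
    print(frequency_moment(a, 2))   # F2 = 14, self-join size / repeat rate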
  • 21–26. Data Streams: Approximation Algorithms. Sliding window (animated over a bit stream such as 1011000111 0101011, with new bits entering and old bits expiring): we can maintain simple statistics over sliding windows using O((1/ε) log² N) space, where N is the length of the sliding window and ε is the accuracy parameter (a toy sketch of the idea follows below). M. Datar, A. Gionis, P. Indyk, and R. Motwani. Maintaining stream statistics over sliding windows. 2002.
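A toy sketch of the exponential-histogram idea behind [Datar+ 02], counting the 1s in a sliding window of bits (the class name, the max_per_size merge threshold and the method names are mine; the real algorithm and its O((1/ε) log² N) space and error analysis are in the paper):

    import random

    class ToyExponentialHistogram:
        def __init__(self, window_len, max_per_size=2):
            self.N = window_len
            self.k = max_per_size      # buckets allowed per size before merging
            self.t = 0
            self.buckets = []          # (timestamp of newest 1, size), newest first

        def add(self, bit):
            self.t += 1
            # expire buckets that slid completely out of the window
            while self.buckets and self.buckets[-1][0] <= self.t - self.N:
                self.buckets.pop()
            if bit != 1:
                return
            self.buckets.insert(0, (self.t, 1))
            size = 1
            while True:
                idx = [i for i, (_, s) in enumerate(self.buckets) if s == size]
                if len(idx) <= self.k:
                    break
                i, j = idx[-2], idx[-1]          # the two oldest buckets of this size
                self.buckets[i] = (self.buckets[i][0], 2 * size)
                del self.buckets[j]
                size *= 2

        def count_ones(self):
            # all full buckets, plus (roughly) half of the oldest, partial one
            if not self.buckets:
                return 0
            return sum(s for _, s in self.buckets) - self.buckets[-1][1] // 2

    eh = ToyExponentialHistogram(window_len=100)
    bits = [random.randint(0, 1) for _ in range(1000)]
    for b in bits:
        eh.add(b)
    print(eh.count_ones(), "approximates", sum(bits[-100:]))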
  • 27. Outline: 1. Introduction; 2. ADWIN: Concept Drift Mining; 3. Hoeffding Adaptive Tree; 4. Conclusions.
  • 28–29. Data Mining Algorithms with Concept Drift. [Diagram: without concept drift, the DM algorithm maps input to output through a static model fed by simple counters (Counter1 … Counter5); with concept drift, the counters are replaced by estimators (Estimator1 … Estimator5) and a change detector drives updates of the model.]
  • 30–32. Time Change Detectors and Predictors: A General Framework. [Diagram, built up over three slides: the input xt feeds an Estimator that outputs the estimation; a Change Detector monitors it and raises an alarm; a Memory module supports both the estimator and the detector.]
  • 33–46. Window Management Models. W = 101010110111111. Four ways of comparing distributions over a window: equal fixed-size subwindows [Kifer+ 04]; the total window against a subwindow [Gama+ 04]; equal-size adjacent subwindows [Dasu+ 06]; and ADWIN, which considers all pairs of adjacent subwindows (the animation slides the split point of W across every position).
  • 47–58. Algorithm ADWIN: example, animated over W = 101010110111111. The split W = W0 · W1 is moved one position at a time (W0 = 1, 10, 101, . . .) until |µ̂W0 − µ̂W1| ≥ εc, at which point change is detected and elements are dropped from the tail of W. The pseudocode shown on each of these slides (a simplified Python sketch follows below):
    ADWIN: ADAPTIVE WINDOWING ALGORITHM
    1 Initialize window W
    2 for each t > 0
    3   do W ← W ∪ {xt} (i.e., add xt to the head of W)
    4      repeat drop elements from the tail of W
    5      until |µ̂W0 − µ̂W1| < εc holds
    6            for every split of W into W = W0 · W1
    7      output µ̂W
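A deliberately naive Python rendering of this loop (my own sketch: it stores the raw window and scans every split in O(|W|) per item, and it uses a generic two-sample Hoeffding-style threshold as a stand-in for the exact εcut derived in the paper; the real ADWIN keeps the window compressed in exponential-histogram buckets and does all of this in logarithmic time and memory):

    import math
    import random

    class NaiveADWIN:
        def __init__(self, delta=0.01):
            self.delta = delta
            self.window = []                   # raw values in [0, 1], oldest first

        def _threshold(self, n0, n1):
            # Hoeffding-style bound on the difference of two sample means;
            # an assumption standing in for the paper's exact epsilon_cut.
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            dp = self.delta / max(len(self.window), 1)
            return math.sqrt((1.0 / (2.0 * m)) * math.log(4.0 / dp))

        def add(self, x):
            """Insert x; return True if a change was detected (window shrunk)."""
            self.window.append(x)
            changed, shrunk = False, True
            while shrunk and len(self.window) >= 2:
                shrunk = False
                n, total, s0 = len(self.window), sum(self.window), 0.0
                for i in range(1, n):          # every split W = W0 . W1
                    s0 += self.window[i - 1]
                    mu0, mu1 = s0 / i, (total - s0) / (n - i)
                    if abs(mu0 - mu1) >= self._threshold(i, n - i):
                        self.window = self.window[i:]   # drop the oldest part W0
                        changed = shrunk = True
                        break
            return changed

        def mean(self):
            return sum(self.window) / len(self.window) if self.window else 0.0

    # A stream whose mean jumps from 0.2 to 0.8 after 1,000 items.
    detector = NaiveADWIN(delta=0.002)
    for t in range(2000):
        x = 1.0 if random.random() < (0.2 if t < 1000 else 0.8) else 0.0
        if detector.add(x):
            print("change detected near t =", t)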
  • 59. Algorithm ADWIN [BG07]. ADWIN has rigorous guarantees (theorems): on the rate of false positives, on the rate of false negatives, and on the relation between the size of the current window and the change rate. Other methods in the literature ([Gama+ 04], [Widmer+ 96], [Last 02]) do not provide rigorous guarantees.
  • 60. Algorithm ADWIN [BG07]. Theorem. At every time step we have: 1. (Few false positives guarantee) If µt remains constant within W, the probability that ADWIN shrinks the window at this step is at most δ. 2. (Few false negatives guarantee) If for any partition of W into two parts W0 W1 (where W1 contains the most recent items) we have |µW0 − µW1| > ε, and if ε ≥ 4 · √( (3 · max{µW0, µW1} / min{n0, n1}) · ln(4n/δ) ), then with probability 1 − δ ADWIN shrinks W to W1, or shorter.
  • 61. Outline: 1. Introduction; 2. ADWIN: Concept Drift Mining; 3. Hoeffding Adaptive Tree; 4. Conclusions.
  • 62–63. Classification Example. A data set that describes e-mail features for deciding whether a message is spam:
    Contains "Money" | Domain type | Has attach. | Time received | spam
    yes | com | yes | night | yes
    yes | edu | no  | night | yes
    no  | com | yes | night | yes
    no  | edu | no  | day   | no
    no  | com | no  | day   | no
    yes | cat | no  | day   | yes
    Assume we have to classify the following new instance: Contains "Money" = yes, Domain type = edu, Has attach. = yes, Time received = day, spam = ?
  • 64. Decision Trees. Basic induction strategy: A ← the "best" decision attribute for the next node; assign A as the decision attribute for the node; for each value of A, create a new descendant of the node; sort the training examples to the leaf nodes; if the training examples are perfectly classified, then STOP, else iterate over the new leaf nodes (a small information-gain sketch follows below).
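To make the choice of the "best" attribute concrete, a small sketch computing information gain on the spam table from the previous slide (the attribute names and the use of entropy/information gain are my illustration; the slide itself does not fix a particular split criterion):

    import math
    from collections import Counter

    def entropy(labels):
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def information_gain(rows, attr, label="spam"):
        gain, n = entropy([r[label] for r in rows]), len(rows)
        for value in set(r[attr] for r in rows):
            subset = [r[label] for r in rows if r[attr] == value]
            gain -= (len(subset) / n) * entropy(subset)
        return gain

    data = [
        {"money": "yes", "domain": "com", "attach": "yes", "time": "night", "spam": "yes"},
        {"money": "yes", "domain": "edu", "attach": "no",  "time": "night", "spam": "yes"},
        {"money": "no",  "domain": "com", "attach": "yes", "time": "night", "spam": "yes"},
        {"money": "no",  "domain": "edu", "attach": "no",  "time": "day",   "spam": "no"},
        {"money": "no",  "domain": "com", "attach": "no",  "time": "day",   "spam": "no"},
        {"money": "yes", "domain": "cat", "attach": "no",  "time": "day",   "spam": "yes"},
    ]
    for a in ("money", "domain", "attach", "time"):
        print(a, round(information_gain(data, a), 3))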
  • 65. Hoeffding Tree / CVFDT. Hoeffding Tree: VFDT. Pedro Domingos and Geoff Hulten. Mining high-speed data streams. 2000. With high probability, it constructs a model identical to the one a traditional (greedy) batch method would learn, with theoretical guarantees on the error rate (the bound it relies on is sketched below).
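The guarantee comes from the Hoeffding bound; a one-function sketch (the function name is mine, the formula is the standard bound VFDT uses): after n observations of a split criterion with range R, the observed mean is within ε of the true mean with probability 1 − δ, so a leaf can split as soon as the gap between the two best attributes exceeds ε.

    import math

    def hoeffding_bound(value_range, delta, n):
        # epsilon such that |observed mean - true mean| <= epsilon
        # with probability 1 - delta after n independent observations
        return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

    # e.g. information gain on a two-class problem has range R = 1 bit
    print(hoeffding_bound(value_range=1.0, delta=1e-7, n=1000))   # about 0.09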
  • 66. VFDT / CVFDT. Concept-adapting Very Fast Decision Trees: CVFDT. G. Hulten, L. Spencer, and P. Domingos. Mining time-changing data streams. 2001. It keeps its model consistent with a sliding window of examples, constructs "alternative branches" in preparation for changes, and switches branches whenever an alternative branch becomes more accurate.
  • 67. Decision Trees: CVFDT. There are no theoretical guarantees on the error rate of CVFDT. CVFDT parameters: 1. W, the example window size; 2. T0, the number of examples used to check at each node whether the splitting attribute is still the best; 3. T1, the number of examples used to build the alternate tree; 4. T2, the number of examples used to test the accuracy of the alternate tree.
  • 68. Decision Trees: Hoeffding Adaptive Tree. The Hoeffding Adaptive Tree replaces the frequency-statistics counters by estimators, so it does not need a window to store examples, because the statistics it needs are maintained by the estimators; and it changes the way the substitution of alternate subtrees is checked, using a change detector with theoretical guarantees. Summary: 1. theoretical guarantees; 2. no parameters.
  • 69. What is MOA? {M}assive {O}nline {A}nalysis is a framework for online learning from data streams. It is closely related to WEKA. It includes a collection of offline and online methods as well as tools for evaluation: boosting and bagging, and Hoeffding Trees with and without Naïve Bayes classifiers at the leaves.
  • 70. Ensemble Methods. http://www.cs.waikato.ac.nz/∼abifet/MOA/ New ensemble methods: ADWIN bagging, in which, when a change is detected, the worst classifier is removed and a new classifier is added; and Adaptive-Size Hoeffding Tree bagging (a sketch of the ADWIN bagging loop follows below).
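A sketch of the ADWIN bagging idea described here (my own Python rendering: it reuses the NaiveADWIN sketch from the note after slides 47–58, poisson1 is Knuth's Poisson(1) sampler used by online bagging, and make_learner() is assumed to return any object with learn(x, y) and predict(x); MOA's actual implementation is Java code inside the framework):

    import math
    import random
    from collections import Counter

    def poisson1():
        # Knuth's method for sampling Poisson(lambda = 1) instance weights.
        threshold, k, p = math.exp(-1.0), 0, 1.0
        while True:
            p *= random.random()
            if p <= threshold:
                return k
            k += 1

    class AdwinBagging:
        def __init__(self, make_learner, n_models=10, delta=0.002):
            self.make_learner = make_learner
            self.delta = delta
            self.learners = [make_learner() for _ in range(n_models)]
            # NaiveADWIN is the change detector sketched alongside slides 47-58.
            self.detectors = [NaiveADWIN(delta) for _ in range(n_models)]

        def train(self, x, y):
            change = False
            for h, d in zip(self.learners, self.detectors):
                for _ in range(poisson1()):            # online bagging weight
                    h.learn(x, y)
                error = 0.0 if h.predict(x) == y else 1.0
                if d.add(error):                       # this member's error drifted
                    change = True
            if change:
                # replace the member with the highest monitored error rate
                worst = max(range(len(self.learners)),
                            key=lambda i: self.detectors[i].mean())
                self.learners[worst] = self.make_learner()
                self.detectors[worst] = NaiveADWIN(self.delta)

        def predict(self, x):
            votes = Counter(h.predict(x) for h in self.learners)
            return votes.most_common(1)[0][0]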
  • 71. Outline: 1. Introduction; 2. ADWIN: Concept Drift Mining; 3. Hoeffding Adaptive Tree; 4. Conclusions.
  • 72. Conclusions. Adaptive and parameter-free methods based on replacing frequency-statistics counters by ADWIN: no window is needed to store examples, because the statistics needed are maintained by the ADWINs, and ADWIN is used as a change detector with theoretical guarantees. Summary: 1. theoretical guarantees; 2. no parameters needed; 3. higher accuracy; 4. less space needed.