SlideShare uma empresa Scribd logo
1 de 163
Baixar para ler offline
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                 Handling Concept Drift:
            Importance, Challenges & Solutions
          A. Bifet, J. Gama, M. Pechenizkiy, I. Žliobaitė


                                           PAKDD-2011 Tutorial
                                          May 27, Shenzhen, China
                                http://www.cs.waikato.ac.nz/~abifet/PAKDD2011/
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                 Motivation for the Tutorial
 • in the real world data
         – more and more data is organized as data streams
         – data comes from complex environment, and it is
           evolving over time
         – concept drift = underlying distribution of data is
           changing
 • attention to handling concept drift is increasing,
   since predictions become less accurate over time
 • to prevent that, learning models need to adapt
   to changes quickly and accurately

 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                   2
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                    Objectives of the Tutorial
 • many approaches to handle drifts in a wide
   range of applications have been proposed

 • the tutorial aims to provide
 an integrated view of the adaptive learning
   methods and application needs



 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                   3
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                                     Outline
         Block 1: PROBLEM

         Block 2: TECHNIQUES

         Block 3: EVALUATION

         Block 4: APPLICATIONS

         OUTLOOK

 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                   4
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook



                             Presenters & Schedule
Indrė Žliobaitė                                                                 Block 1: what concept
                                                                                drift is, why it needs
                                       14:05 – 14:50
                                                                                to be handled and how



 João Gama
                                       14:50 – 15:40                           Block 2: approaches for
                                                                               adaptive machine learning
                                                                               to handle concept drift



 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                   5
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook



                               Presenters & Schedule
     Albert Bifet
                                                  Block 3: training/evaluation
                                    16:00 – 16:45 of adaptive learning approaches
                                                  and a software demo



Mykola Pechenizkiy
                                                Block 4: challenges for
                                                handling concept drift
                                  16:45 – 17:30
                                                in different applications




   PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                     6
   May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                   7
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                            Block 1: introduction




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                   8
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




 The goals:
 • to introduce the problem of concept drift in
   supervised learning
 • to overview and categorize the main
   principles how concept drift can be (and is)
   handled



 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                   9
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                            Example: production




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 10
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook
                                              many process
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook                                  many process
                                               parameters                                                                                                                parameters
                            Example: production
                                                                                                                                                                                 prediction
                                                                                                                                                         predict quality
             reduced and
             transformed
             parameters
                                                                            t1, t2                                                                                                      q
                                    3
                                                                                                                                                                   79
     monitor                                                                                                                                                       78
                                                                                                                                                                        3S




                                                                                                                                                Konz. (INA&INAL)
                                    2
                                                                                                                                                                        2S
                                                                                                                                                                   77
                                    1                                                                                                                              76
                                                                                                                                                                   75 Target
                             t[2]




                                    0
                                                                                                                                                                   74
                                    -1                                                                                                                             73
                                                                                                                                                                        2S
                                                                                                                                                                   72
                                    -2
                                                                                                                                                                        3S
                                                                                                                                                                             0    100       200   300    400
                                    -3
                                         -8   -7   -6   -5   -4   -3   -2   -1   0      1   2   3   4      5         6         7            8                                           Num
                                                                                 t[1]
                                                                                                        SIMCA-P+ 11 - 13.12.2005 10:17:16




 PAKDD-2011 Tutorial,                                              A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                                                                                        11
 May 27, Shenzhen, China                                           Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook
                                              many process
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook                                  many process
                                               parameters                                                                                                                parameters
                            Example: production
                                                                                                                                                                                 prediction
                                                                                                                                                         predict quality
             reduced and
             transformed
             parameters
                                                                            t1, t2                                                                                                      q
                                    3
                                                                                                                                                                   79
     monitor                                                                                                                                                       78
                                                                                                                                                                        3S




                                                                                                                                                Konz. (INA&INAL)
                                    2
                                                                                                                                                                        2S
                                                                                                                                                                   77
                                    1                                                                                                                              76
                                                                                                                                                                   75 Target
                             t[2]




                                    0
                                                                                                                                                                   74
                                    -1                                                                                                                             73
                                                                                                                                                                        2S
                                                                                                                                                                   72
                                    -2
                                                                                                                                                                        3S
                                                                                                                                                                             0    100       200   300    400
                                    -3
                                         -8   -7   -6   -5   -4   -3   -2   -1   0      1   2   3   4      5         6         7            8                                           Num
                                                                                 t[1]
                                                                                                        SIMCA-P+ 11 - 13.12.2005 10:17:16




                                                                                                                                                after 2-3 months
                                                                                                                                                predictions get
                                                                                                                                                “out of tune”


 PAKDD-2011 Tutorial,                                              A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                                                                                        12
 May 27, Shenzhen, China                                           Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                            Static model


                                                                             Model does not
                                                                                change




                                                   Process changes

                                                   source: Evonik Industries




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 13
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook



 the supplier of raw                                     a sensor breaks/                                    new operating
 material changes                                        “wears off”/                                        crew
                                                         is replaced



                                                                             Model does not
                                                                                change




                                                   Process changes

                                                   source: Evonik Industries

                       new regulations to                                    new production
                       save electricity                                      procedures


 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 14
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook


          Desired Properties of a System To
                Handle Concept Drift
 • Adapt to concept drift asap

 • Distinguish noise from changes
         – robust to noise, but adaptive to changes


 • Recognizing and reacting
   to reoccurring contexts

 • Adapting with limited resources
         – time and memory
 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 15
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                   Setting: adaptive learning




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 16
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                             Supervised Learning
                                                                                         Training:
                                                                                         Learning
                                                      Testing         X'                 a mapping function
           population                                  data

                                                                                         y = L (x)
                                                              2.
                                                                                         Application:
                                                                                         Applying L
                                                                                         to unseen data

  X      Historical data                          Learner                                y' = L (X')
                                1. training


  y           labels                                          2. application



                                                 testing labels ?       y'

 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 17
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                 Learning with concept drift
                                population
                                                                                          Training:

                        = ??                             Testing
                                                          data
                                                                        X'                y = L (x)
             population

                                                                                          Application:

                                                                                          y' = L?? (X')
        X     Historical data                        Learner

        y           labels



                                                    testing labels ?       y'

 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 18
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                         Online setting
                                                                                                  ?




                                                                                                                   time

                                                 Data arrives in a stream




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 19
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                 Adaptive Learning
                                                                                                  ?




                                                                                                                   time


                                                                                                             updated
                                                                                                             model
                as time passes
                the model updates/retrains                                                               prediction
                itself if needed

 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 20
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                         How to Adapt
                            Select the right
                            training data and                                                  or adjust
                            retrain                                                            the model
                                                                                               incrementally
                                                                        ?




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 21
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook


                         The main strategies how
                          to handle concept drift




                                  Images.com




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 22
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




               Adaptive learning strategies
                                          Triggering                                      Evolving




Single classifier
                                                 Detectors                                      Forgetting

                                                 variable windows                                          fixed windows,
                                                                                                           Instance weighting

  Ensemble                                                                               Dynamic
                                       Contextual                                        ensemble

                                                                                                       adaptive
                                                 dynamic integration,
                                                                                                       combination rules
                                                 meta learning

 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 23
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




               Adaptive learning strategies
                                          Triggering                                      Evolving
change detection and                                                                 adapting at every step
a follow up reaction
Single classifier
                                                 Detectors                                      Forgetting

                                                 variable windows                                          fixed windows,
                                                                                                           Instance weighting

  Ensemble                                                                               Dynamic
                                       Contextual                                        ensemble

                                                                                                       adaptive
                                                 dynamic integration,
                                                                                                       combination rules
                                                 meta learning

 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 24
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




               Adaptive learning strategies
     reactive,
                                          Triggering                                      Evolving
     forgetting


Single classifier
                                                 Detectors                                      Forgetting

                                                 variable windows                                          fixed windows,
                                                                                                           Instance weighting

  Ensemble                                                                               Dynamic
                                       Contextual                                        ensemble

    maintain                                                                                           adaptive
                                                 dynamic integration,
    some                                                                                               combination rules
                                                 meta learning
    memory
 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 25
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




               Adaptive learning strategies
                                          Triggering                                      Evolving

                                                                   forget old data and retrain at
                                                                   a fixed rate
Single classifier
                                                 Detectors                                      Forgetting

                                                                                                           fixed windows,
                                                                                                           Instance weighting

  Ensemble                                                                               Dynamic
                                       Contextual                                        ensemble




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 26
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook



                        Fixed Training Window
                                                                                                              time


                                       train                                      predict




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 27
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook



                        Fixed Training Window
                                                                                                              time




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 28
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




               Adaptive learning strategies
                                          Triggering                                      Evolving

                                  detect a change and cut

Single classifier
                                                 Detectors                                      Forgetting

                                                 variable windows


  Ensemble                                                                               Dynamic
                                       Contextual                                        ensemble




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 29
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                   Variable Training Window


                                  detect a change




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 30
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                   Variable Training Window




              drop old data




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 31
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                   Variable Training Window




                                                 retrain the model                            predict
 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 32
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




               Adaptive learning strategies
                                          Triggering                                      Evolving




Single classifier
                                                 Detectors                                      Forgetting




  Ensemble                                                                               Dynamic
                                       Contextual                                        ensemble

                                                                                                       adaptive
                                                        build many models,                             combination rules
                                                        dynamically combine
 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 33
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                Dynamic Ensemble
                                        Classifier 1




                                                       Classifier 2


                                                                                                                             vote
                                Classifier 3




  Classifier 4




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 34
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                Dynamic Ensemble
   voter 1                                                                                                                time



   voter 2




   voter 3




   voter 4

    TRUE
 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 35
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                Dynamic Ensemble
                                            punish
   voter 1                                                   voter 1


                                             reward
   voter 2                                                   voter 2


                                            punish
   voter 3                                                   voter 3


                                             reward
   voter 4                                                   voter 4


    TRUE
 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 36
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                Dynamic Ensemble
                                            punish
   voter 1                                                   voter 1


                                             reward
   voter 2                                                   voter 2


                                            punish
   voter 3                                                   voter 3


                                             reward
   voter 4                                                   voter 4


    TRUE                                                         TRUE
 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 37
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                Dynamic Ensemble
                                            punish
                                                             voter 1                                       voter 1
   voter 1


                                             reward
                                                             voter 2                                       voter 2
   voter 2


                                            punish
                                                             voter 3                                       voter 3
   voter 3


                                             reward
                                                             voter 4                                       voter 4
   voter 4

    TRUE                                                         TRUE
 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 38
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                Adaptive learning strategies
                                           Triggering                                      Evolving




 Single classifier
                                                  Detectors                                      Forgetting




   Ensemble                                                                               Dynamic
                                        Contextual                                        ensemble
build many models,         dynamic integration,
switch models according    meta learning
to the observed incoming data
  PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                  39
  May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




          Contextual (Meta) Approaches

                                                                              partition the training data
                Group 1 = Classifier 1                                        build classifiers




                                                            Group 3 = Classifier 3
     Group 2 = Classifier 2




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 40
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




          Contextual (Meta) Approaches
                                                              find to which partition
                                                              the new instance belongs and
                                                              assign the classifier
                Group 1 = Classifier 1




                                                            Group 3 = Classifier 3
     Group 2 = Classifier 2




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 41
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




 • adaptive learning approaches implicitly or
   explicitly assume some type of change




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 42
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                     Types of Changes: speed

                            mean
     SUDDEN
                                                                                                                         time
                            mean




      GRADUAL

                                                                                                                         time




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                43
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




        Types of Changes: reoccurrence

                                 mean
ALWAYS NEW*
                                                                                                            time
                                 mean




                                                                                                            time
                                 mean




REOCCURING
                                                                                                            time




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 44
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                      Typically Used*
                                          Triggering                                     Evolving
                                          (change detection)                             (adapting every step)
                                                            sudden drift
Single classifier
                                                 Detectors                                     Forgetting

                                                 variable windows                                        fixed windows,
                                                                                                         Instance weighting

  Ensemble                                                                              Dynamic
                                      Contextual                                        ensemble

                                                                                                      adaptive
 * this mapping is                               dynamic integration,
                                                                                                      fusion rules
 not strict or crisp                             meta learning

 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                45
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                        Typically Used
                                          Triggering                                     Evolving
                                          (change detection)                             (adapting every step)


Single classifier
                                                 Detectors                                     Forgetting

                                                 variable windows                                        fixed windows,
                                                                                                         Instance weighting

  Ensemble                                                                              Dynamic
                                      Contextual                                        ensemble

                                                                                                      adaptive
                                                 dynamic integration,
                                                                                                      fusion rules
                                                 meta learning

 PAKDD-2011 Tutorial,
 May 27, Shenzhen, China                                gradual drift                                                           46
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook
 What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                         Typically Used
                                           Triggering                                     Evolving
                                           (change detection)                             (adapting every step)


 Single classifier
                                                  Detectors                                     Forgetting

reoccuring drift                                  variable windows                                        fixed windows,
                                                                                                          Instance weighting

   Ensemble                                                                              Dynamic
                                       Contextual                                        ensemble

                                                                                                       adaptive
                                                  dynamic integration,
                                                                                                       fusion rules
                                                  meta learning

  PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 47
  May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                      Which approach to use?
 • changes occur over time
 • we need models that evolve over time
 • choice of technique depends on
         – what type of change is expected
         – user goals/ applications




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                48
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                      Block summary




                                  Images.com




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 49
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                      Block summary
 • Concept drift is a problem in data stream
   applications, since static models lose accuracy
 • Different adaptive techniques exist
 • The choice of technique depends on
         – what change is expected
         – peculiarities of the task and application




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 50
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




     Block 2: methods and techniques




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 51
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




 The goal:
 to discuss selected popular approaches from
   each of the major types in more detail

         • Data management                                      Forgetting



         • Detection Methods                                                              Detectors


                                                                  Dynamic
         • Adaptation methods                                     ensembles


         • Decision model management                                                     Contextual

 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 52
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook

                   Input Data




          Short term Memory
                Data Management




                                                                 Control
                                                     Learning
                                                         Detection
                                                                 Adaptation


                                                              Memory
                                                          Model Management
                                                           Model Adaptation



 PAKDD-2011 Tutorial,                                                            Output Model                                    53
 May 27, Shenzhen, China
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                Data Management
 • Characterize the information stored in
   memory to maintain a decision model
   consistent with the actual state of the nature.
         – Full Memory.
         – Partial Memory.




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 54
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                              Weighting examples
• Full Memory.
        – Store in memory sufficient statistics over all the examples.
        – Weighting the examples accordingly to their age.
        – Oldest examples are less important.
• Weighting examples based on the age:
                                           wλ ( x) = exp(−λt x )

               where example x was found tx time steps ago




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 55
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook



               Weight as a Function of Age




 PAKDD-2011 Tutorial,
                                                                                                                                 56
 May 27, Shenzhen, China
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                      Partial Memory
 • Partial Memory.
         – Store in memory only the most recent examples.
         – Examples are stored in a first-in first-out data structure.
         – At each time step the learner induces a decision model
           using only the examples that are included in the window.




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 57
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                      Partial Memory
 • The key difficulty is how to select the appropriate
   window size:
    – Small window
                • can assure a fast adaptability in phases with concept
                  changes
                • In more stable phases it can affect the learner
                  performance
         – Large window
                • produce good and stable learning results in stable
                  phases
                • can not react quickly to concept changes.

 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 58
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                 Partial Memory - Windows
 • Fixed Size windows.
         – Store in memory a fixed number of the most recent
           examples.
         – Whenever a new example is available:
                • it is stored in memory and
                • the oldest one is discarded.
         – This is the simplest method to deal with concept drift and
           can be used as a baseline for comparisons.
 • Adaptive Size windows.
         – the set of examples in the window is variable.
         – They are used in conjunction with a detection model.
         – Decreasing the size of the window whenever the detection
           model signals drift and increasing otherwise.

 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 59
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                               Discussion
 • The data management model also indicates the
   forgetting mechanism.
         – Weighting examples:
                • Corresponds to a gradual forgetting.
                • The relevance of old information is less and less important.
         – Time windows
                • Corresponds to abrupt forgetting;
                • The examples are deleted from memory.
         – Of course we can combine both forgetting
           mechanisms by weighting the examples in a time
           window.

 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 60
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                               Discussion
 • These methods are blind adaptation models:
         – There is no explicit change detection
         – Do not provide:
                • Any indication about change points
                • Dynamics of the process generating data




 PAKDD-2011 Tutorial,                     A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 61
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                               Detection Methods
 • The Detection Model characterizes the techniques
   and mechanisms for explicit drift detection.
    – An advantage of the detection model is they can
      provide:
                • Meaningful descriptions:
                   – indicating change-points or
                   – small time-windows where the change occurs
                • Quantification of the changes.




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 62
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                               Detection Methods
 • Monitoring the evolution of performance
   indicators.
         (See Klinkenberg (98) for a good overview of these
           indicators)
         – Performance measures,
         – Properties of the data, etc
 • Monitoring distributions on two different time-
   windows
         (See D.Kifer, S.Ben-David, J.Gehrke; VLDB 04)
         – A reference window over past examples
         – A window over the most recent examples

 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 63
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook


Monitoring the evolution of performance
               indicators
 • Klinkenberg (1998)proposed monitoring the
   values of three performance indicators:
    – Accuracy, recall and precision over time,
    – and then, comparing it to a confidence interval of
      standard sample errors
    – for a moving average value using the last M batches
      of each particular indicator.
 • FLORA family of algorithms (Widmer and Kubat, 1996)
   monitors accuracy and model size.
    – includes a window adjustment heuristic for a rule-
      based classifier.
    – FLORA3: dealing with recurring concepts.

 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 64
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook


           Monitoring distributions on two
              different time-windows
• D.Kifer, S.Ben-David, J.Gehrke (VLDB 04)
        – Propose algorithms (statistical tests based on Chernoff
          bound) that examine samples drawn from two probability
          distributions and decide whether these distributions are
          different.
• Gama, Fernandes, Rocha: VFDTc (IDA 2006)
        – For streaming data decision tree induction
               • Exploits tree structure: nested trees
        – Continuously monitoring differences between two class-
          distribution of the examples:
               • the distribution when a node was built and the class-distribution
                 when a node was a leaf and
               • the weighted sum of the class-distributions in the leaves
                 descendant of that node.

 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 65
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                     Hoeffding Trees
 • Multiple windows
         – Each node recives information from different windows
                • Root node: oldest data
                • Leaves: most recent data




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 66
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                      The RS Method
   • Implemented in the VFDTc system (IDA 2006)
   • For each decision node i, compute two estimates
     of the classification error.
          – Static error (SEi):
                 • the distribution of the error at node i ;
          – Backed up error (BUEi):
                 • the sum of the error distributions of all the descending
                   leaves of the node i ;
          – With these two distributions:
                 • we can detect the concept change,
                 • by verifying the condition SEi ≤ BUEi

 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 67
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                              RS Method
• Each new example traverses
  the tree from the root to a leaf
     – At the leaf: update the value of
       SEi
     – It makes the opposite path, and
       update the values of SEi and BUEi
       for each internal node,
     – Verify the regularization
       condition
       SEi ≤ BUEi :
             • If SEi ≤ BUEi then the node i is
               pruned to a new leaf.


 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 68
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 69
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




     ADAPTATION METHODS


 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 70
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                            Adaptation Methods
 • The Adaptation model characterizes the
   changes in the decision model do adapt to the
   most recent examples.
         – Blind Methods:
                • Methods that adapt the learner at regular intervals
                  without considering whether changes have really
                  occurred.
         – Informed Methods:
                • Methods that only change the decision model after a
                  change was detected. They are used in conjunction
                  with a detection model.

 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 71
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




         Granularity of Decision Models
 • Occurrences of drift can have impact in part of the
   instance space.
    – Global models: Require the reconstruction of all
      the decision model.
                • like naive Bayes, SVM, etc
         – Granular decision models: Require the
           reconstruction of parts of the decision model
                • like decision rules, decision trees



 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 72
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook



                                   CVFDT algorithm
• Mining time-changing data streams , G. Hulten, L.
  Spencer, and P. Domingos; ACM SIGKDD 2001

• Process examples from the stream indefinitely.
        – For each example (x, y),
               • Pass (x, y) down to a set of leaves using HT and all alternate trees of
                 the nodes (x, y) passes through.
               • Add (x, y) to the sliding window of examples.
• Periodically scans the internal nodes of HT.
• Start a new alternate tree when a new winning attribute
  is found.
        – Tighter criteria to avoid excessive alternate tree creation.
        – Limit the total number of alternate trees

 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 73
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook



        Smoothly adjust to concept drift
• Alternate trees are grown the
                                                                                                                  Age<30?
  same way HT is.
                                                                                                       Yes                               No
• Periodically each node with
  non-empty alternate trees
  enter a testing mode.                                              Married?                     Car Type=                           Yes
                                                                                                  Sports Car?
      – M training examples to                                Yes                   No
        compare accuracy.                                                                      Yes                No
      – Prune alternate trees with                             Yes                  No                         Experience
                                                                                                  No
        non-increasing accuracy over                                                                            <1 year?
        time.                                                                                           Yes                    No
      – Replace if an alternate tree is
        more accurate.                                                                                    No                 Yes




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 74
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




             Decision model management
 • Model management characterizes the number
   of decision models needed to maintain in
   memory.
 • The key issue here is the assumption that data
   generated comes from multiple distributions,
         – at least in the transition between contexts.
         – Instead of maintaining a single decision model
           several authors propose the use of multiple
           decision models.
 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 75
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




               Dynamic Weighted Majority
 • A seminal work, is the system presented by
   Kolter and Maloof (ICDM03, ICML05).
 • The Dynamic Weighted Majority algorithm
   (DWM) is an ensemble method for tracking
   concept drift.
         – Maintains an ensemble of base learners,
         – Predicts using a weighted-majority vote of these
           experts.
         – Dynamically creates and deletes experts in response
           to changes in performance.

 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 76
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




     DETECTION ALGORITHMS


 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 77
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




Change Detection in Predictive Learning
 • When there is a change in the class-distribution of the
   examples:
         – The actual model does not correspond any more to the
           actual distribution.
         – The error-rate increases
 • Basic Idea:
         – Learning is a process.
         – Monitor the quality of the learning process:
                • Using Statistical Process Control techniques.
                       – Monitor the evolution of the error rate.
         – Main Problems:
                • How to distinguish Change from Noise?
                • How to React to drift?

 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 78
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




           SPC-Statistical Process Control
 Learning with Drift Detection ,Gama, Medas, Gladys, Rodrigues; SBIA-
    LNCS Springer, 2004.
 • Suppose a sequence of examples in the form < xi,yi>
 • The actual decision model classifies each example in the sequence
    – In the 0-1 loss function, predictions are either True or False
    – The predictions of the learning algorithm are sequences:
      T, F,T, F,T, F,T,T,T, F, …
    – The Error is a random variable from Bernoulli trials.
    – The Binomial distribution gives the general form of the
      probability of observing a F:
                • pi = (#F/i)
                •                             where i is the number of trials.
                   si =    pi (1 − pi ) / i




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 79
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




Statistical Processing Control Algorithm
    • Maintains two registers:
           – Pmin and Smin such that Pmin + Smin = min(pi + si )
           – Minimum of the error rate taking into account the variance
             of the estimator.
    • At example j : the error of the learning algorithm will be
           – Out-control if pj + sj > pmin + α smin
           – In-control if pj + sj < pmin + β smin
           – Warning Level: if pmin + α smin > pj + sj > pmin + smin
    • The constants α and β depend on the desired
      confidence level.
           – Admissible values are = 2 and = 3.

  PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                  80
  May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




             Statistical Processing Control




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 81
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




   Illustrative Example: SEA Concepts




• The stream:




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 82
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook



   The Individual Error of the Models




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 83
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook



                                          System Error




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                84
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                               Sequential Analysis
 • Sequential analysis developed a set of
   methods for monitoring change detection:
         – Sequential Probability Ratio Test - SPRT algorithm,
           D.Wald, Sequential Tests of Statistical
           Hypotheses. Annals of Mathematical
           Statistics 16 (2): 117–186; 1945
         – Cumulative Sums - CUSUM: E. Page, Continuous
           Inspection Scheme. Biometrika 41,1954


 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 85
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                               CUSUM Algorithms
         The CUSUM test is used to detect significant increases (or
         decreases) in the successive observations of a random variable x

    • Detecting significant increases: • Detecting decreases:

           – g0 = 0                                                             – g0 = 0
           – gt = max(0; gt-1+ (xt - α))                                        – gt = min(0; gt-1+ (xt - α))
    • The decision rule is:                                              • The decision rule is:
           – if gt > λ then alarm and gt = 0.                                   – if gt < λ then alarm and gt = 0.




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 86
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                Page-Hinckley Test
 • The PH test is a sequential adaptation of the
   detection of an abrupt change in the average
   of a Gaussian signal.
         – It considers a cumulative variable mT , defined as
           the cumulated difference between the observed
           values and their mean till the current moment:




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 87
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                        The Page-Hinckley Test


 • The minimum value of mt is also computed:
         – MT = min(mt ; t = 1, …,T).
 • The test monitors the difference between MT and
   mT :
         – PHT = mT - MT .
 • When this difference is greater than a given
   threshold (λ) we alarm a change in the
   distribution.

 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 88
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                              Illustrative Example




        – The left figure plots the on-line error rate of a learning
          algorithm.
        – The center plot is the accumulated on-line error. The slope
          increases after the occurrence of a change.
        – The right plot presents the evolution of the PH statistic.
 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 89
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                Page-Hinckley Test
 • The PHt test is memoryless, and its accuracy
   depends on the choice of parameters α and λ.
    – Both parameters are relevant to control the trade-
      off between earlier detecting true alarms by
      allowing some false alarms.
 • The threshold λ depends on the admissible false
   alarm rate.
    – Increasing λ will entail fewer false alarms, but
      might miss some changes.

 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 90
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




Algorithm Adaptive Window-ADWIN
• Learning from Time-Changing Data with Adaptive Windowing,
  A.Bifet, R.Gavaldà (SDM'07)
• Whenever two “large enough” subwindows of W exhibit
  “distinct enough” averages,
     – we can conclude that the corresponding expected values are different,
       and
     – the older portion of the window is dropped




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 91
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                                  Algorithm ADWIN
• The value of εcut for a partition W0 . W1 of W is
  computed as follows:
      – Let n0 and n1 be the lengths of W0 and W1 and
      – Let n be the length of W, so n = n0 + n1.
      – Let ηW0 and ηW1 be the averages of the values in W0 and W1,
        and W0 and W1 their expected values.
      – To obtain totally rigorous performance guarantees we define:




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 92
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




 Characteristics of Algorithm ADWIN
 • False positive rate bound:
         – If ηt remains constant within W, the probability that
           ADWIN shrinks the window at this step is at most δ.
 • False negative rate bound.
         – Suppose that for some partition of W in two parts
           W0W1 (where W1 contains the most recent items) we
           have:
           |η W0 - η W1| > 2*εcut.
           Then with probability 1 - δ ADWIN shrinks W to W1,
           or shorter.

 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 93
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook




                             Illustrative Examples
                   Abrupt Change                                                      Gradual Change




 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 94
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook



             ADWIN – Compressing Space
         •  Using exponential histograms
         • Requires O(log W) space and time
         • It can provide exact counts for all the subwindows

      Example: Counting 1’s in a bit stream
                                                Compress:                      Compress again:
A new 1 arrives:                                There are 3 buckets of size 1: There are 3 buckets of size 2:




                                                  Remove buckets when a drift is detect:
What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook



                                                 Summary
 Dimensions of Analysis:
         • Data management
                • Characterize the information stored in memory to maintain
                  a decision model consistent with the actual state of the
                  nature
         • Detection Methods
                • Characterizes the techniques and mechanisms for explicit
                  drift detection
         • Adaptation methods
                • Characterizes the changes in the decision model do adapt to
                  the most recent examples.
         • Decision model management
                • Characterize the number of decision models needed to
                  maintain in memory.
 PAKDD-2011 Tutorial,                     A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite
                                                                                                                                 96
 May 27, Shenzhen, China                  Handling Concept Drift: Importance, Challenges and Solutions
What	
  CD	
  iis	
  About	
  ::	
  Types	
  of	
  Dri5s	
  &	
  	
  Approaches	
  ::	
  Techniques	
  iin	
  Detail	
  ::	
  Evalua=on	
  :::	
  MOA	
  Demo	
  :::	
  Types	
  of	
  Applica=ons	
  :::	
  Outlook	
  
What	
  CD	
   s	
  About	
  ::	
  Types	
  of	
  Dri5s	
  & Approaches	
  ::	
  Techniques	
   n	
  Detail	
  ::	
  Evalua.on	
   :	
  MOA	
  Demo	
   :	
  Types	
  of	
  Applica=ons	
   :	
  Outlook	
  




  Block	
  3:	
  evalua=on	
  and	
  MOA	
  demo	
  




 PAKDD-­‐2011	
  Tutorial,	
  	
  	
                               A.	
  Bifet,	
  J.Gama,	
  M.	
  Pechenizkiy,	
  I.Zliobaite	
  	
  	
  
                                                                                                                                                                                                                98	
  
 May	
  27,	
  Shenzhen,	
  China	
                                Handling	
  Concept	
  Dri5:	
  Importance,	
  Challenges	
  and	
  Solu=ons	
  
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions
PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions

Mais conteúdo relacionado

Destaque

Retaining wall design_example
Retaining wall design_exampleRetaining wall design_example
Retaining wall design_examplealibunar
 
building construction and material
building construction and materialbuilding construction and material
building construction and materialsuzain ali
 
Issue Management- Public Relations
Issue Management- Public RelationsIssue Management- Public Relations
Issue Management- Public RelationsSiddhant Gupta
 
proper food handling, food safety, and sanitation practices
proper food handling, food safety, and sanitation practicesproper food handling, food safety, and sanitation practices
proper food handling, food safety, and sanitation practicesMichicko Janairo
 
THE BEARING WALL STRUCTURE (Struktur Dinding Pemikul)
THE BEARING WALL STRUCTURE (Struktur Dinding Pemikul)THE BEARING WALL STRUCTURE (Struktur Dinding Pemikul)
THE BEARING WALL STRUCTURE (Struktur Dinding Pemikul)rerianita
 
Seismic Analysis of Structures - III
Seismic Analysis of Structures - IIISeismic Analysis of Structures - III
Seismic Analysis of Structures - IIItushardatta
 
Moa: Real Time Analytics for Data Streams
Moa: Real Time Analytics for Data StreamsMoa: Real Time Analytics for Data Streams
Moa: Real Time Analytics for Data StreamsAlbert Bifet
 
Seismic Analysis of Structures - II
Seismic Analysis of Structures - IISeismic Analysis of Structures - II
Seismic Analysis of Structures - IItushardatta
 
Seismic Analysis
Seismic AnalysisSeismic Analysis
Seismic AnalysisKrishnagnr
 
contemporary challenges in management
contemporary challenges in managementcontemporary challenges in management
contemporary challenges in managementIbadat Singh
 
Complaint Handling Keeping Guests Happy.
Complaint Handling  Keeping Guests Happy.Complaint Handling  Keeping Guests Happy.
Complaint Handling Keeping Guests Happy.Vasantkumar Parakhiya
 
Dynamic Analysis with Examples – Seismic Analysis
Dynamic Analysis with Examples – Seismic AnalysisDynamic Analysis with Examples – Seismic Analysis
Dynamic Analysis with Examples – Seismic Analysisopenseesdays
 

Destaque (16)

Rigid frame systems
Rigid frame systemsRigid frame systems
Rigid frame systems
 
Retaining wall design_example
Retaining wall design_exampleRetaining wall design_example
Retaining wall design_example
 
building construction and material
building construction and materialbuilding construction and material
building construction and material
 
Issue Management- Public Relations
Issue Management- Public RelationsIssue Management- Public Relations
Issue Management- Public Relations
 
proper food handling, food safety, and sanitation practices
proper food handling, food safety, and sanitation practicesproper food handling, food safety, and sanitation practices
proper food handling, food safety, and sanitation practices
 
THE BEARING WALL STRUCTURE (Struktur Dinding Pemikul)
THE BEARING WALL STRUCTURE (Struktur Dinding Pemikul)THE BEARING WALL STRUCTURE (Struktur Dinding Pemikul)
THE BEARING WALL STRUCTURE (Struktur Dinding Pemikul)
 
Seismic Analysis of Structures - III
Seismic Analysis of Structures - IIISeismic Analysis of Structures - III
Seismic Analysis of Structures - III
 
Moa: Real Time Analytics for Data Streams
Moa: Real Time Analytics for Data StreamsMoa: Real Time Analytics for Data Streams
Moa: Real Time Analytics for Data Streams
 
Seismic Analysis of Structures - II
Seismic Analysis of Structures - IISeismic Analysis of Structures - II
Seismic Analysis of Structures - II
 
Seismic Analysis
Seismic AnalysisSeismic Analysis
Seismic Analysis
 
Load bearing elements part5
Load bearing elements part5Load bearing elements part5
Load bearing elements part5
 
Type of high rise building
Type of high rise buildingType of high rise building
Type of high rise building
 
contemporary challenges in management
contemporary challenges in managementcontemporary challenges in management
contemporary challenges in management
 
Complaint Handling Keeping Guests Happy.
Complaint Handling  Keeping Guests Happy.Complaint Handling  Keeping Guests Happy.
Complaint Handling Keeping Guests Happy.
 
Dynamic Analysis with Examples – Seismic Analysis
Dynamic Analysis with Examples – Seismic AnalysisDynamic Analysis with Examples – Seismic Analysis
Dynamic Analysis with Examples – Seismic Analysis
 
SHEAR WALL
SHEAR WALLSHEAR WALL
SHEAR WALL
 

Semelhante a PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions

Agenda Taller C4D
Agenda Taller C4D Agenda Taller C4D
Agenda Taller C4D UNICEF
 
Principles and Techniques of Test Construction
Principles and Techniques of Test ConstructionPrinciples and Techniques of Test Construction
Principles and Techniques of Test ConstructionOlufemi Jeremiah Olubodun
 
Approaches to changing assessment and feedback practice
Approaches to changing assessment and feedback practiceApproaches to changing assessment and feedback practice
Approaches to changing assessment and feedback practicejisc-elearning
 
M&E completion training report oct 142012
M&E completion training report oct 142012M&E completion training report oct 142012
M&E completion training report oct 142012dr-ayub
 
How to interview customers & design UX across devices (Orange Hive Intro)
How to interview customers & design UX across devices (Orange Hive Intro)How to interview customers & design UX across devices (Orange Hive Intro)
How to interview customers & design UX across devices (Orange Hive Intro)Angela Ognev
 
Doctoral Consortium: Applying Quantified Self Approaches to Support Reflectiv...
Doctoral Consortium: Applying Quantified Self Approaches to Support Reflectiv...Doctoral Consortium: Applying Quantified Self Approaches to Support Reflectiv...
Doctoral Consortium: Applying Quantified Self Approaches to Support Reflectiv...veronicarp
 
Sample Lesson Log on MYDev Life Skills Opening Activity
Sample Lesson Log on MYDev Life Skills Opening ActivitySample Lesson Log on MYDev Life Skills Opening Activity
Sample Lesson Log on MYDev Life Skills Opening ActivityVicente Antofina
 
Introduction to Participatory Monitoring-Evaluation
Introduction to Participatory Monitoring-EvaluationIntroduction to Participatory Monitoring-Evaluation
Introduction to Participatory Monitoring-EvaluationFIDAfrique-IFADAfrica
 
Readiness for direct practice - Using video as a tool to assess Masters socia...
Readiness for direct practice - Using video as a tool to assess Masters socia...Readiness for direct practice - Using video as a tool to assess Masters socia...
Readiness for direct practice - Using video as a tool to assess Masters socia...mdxaltc
 
Formative EvaluationFormative evaluation gives real results as t.docx
Formative EvaluationFormative evaluation gives real results as t.docxFormative EvaluationFormative evaluation gives real results as t.docx
Formative EvaluationFormative evaluation gives real results as t.docxhanneloremccaffery
 
Agile Washington 2015 Creating a Learning Culture
Agile Washington 2015 Creating a Learning CultureAgile Washington 2015 Creating a Learning Culture
Agile Washington 2015 Creating a Learning CultureRenee Troughton
 
Agile Manifesto and Practices Selection for Tailoring Software Development
Agile Manifesto and Practices Selection for Tailoring Software DevelopmentAgile Manifesto and Practices Selection for Tailoring Software Development
Agile Manifesto and Practices Selection for Tailoring Software DevelopmentManuel Kolp
 
Introduction to research techniques
Introduction to research techniquesIntroduction to research techniques
Introduction to research techniquesPranav Koolwal
 
Career Counseling Application Prototype.pptx
Career Counseling Application Prototype.pptxCareer Counseling Application Prototype.pptx
Career Counseling Application Prototype.pptxMee Mee Alainmar
 
Using Repertory Grids as a cross cultural research technique (aka measuring a...
Using Repertory Grids as a cross cultural research technique (aka measuring a...Using Repertory Grids as a cross cultural research technique (aka measuring a...
Using Repertory Grids as a cross cultural research technique (aka measuring a...Anthony Sonego
 

Semelhante a PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions (20)

[Imr]week5
[Imr]week5[Imr]week5
[Imr]week5
 
Agenda Taller C4D
Agenda Taller C4D Agenda Taller C4D
Agenda Taller C4D
 
Principles and Techniques of Test Construction
Principles and Techniques of Test ConstructionPrinciples and Techniques of Test Construction
Principles and Techniques of Test Construction
 
Approaches to changing assessment and feedback practice
Approaches to changing assessment and feedback practiceApproaches to changing assessment and feedback practice
Approaches to changing assessment and feedback practice
 
De 1b report
De 1b reportDe 1b report
De 1b report
 
sem6.pdf
sem6.pdfsem6.pdf
sem6.pdf
 
M&E completion training report oct 142012
M&E completion training report oct 142012M&E completion training report oct 142012
M&E completion training report oct 142012
 
How to interview customers & design UX across devices (Orange Hive Intro)
How to interview customers & design UX across devices (Orange Hive Intro)How to interview customers & design UX across devices (Orange Hive Intro)
How to interview customers & design UX across devices (Orange Hive Intro)
 
Doctoral Consortium: Applying Quantified Self Approaches to Support Reflectiv...
Doctoral Consortium: Applying Quantified Self Approaches to Support Reflectiv...Doctoral Consortium: Applying Quantified Self Approaches to Support Reflectiv...
Doctoral Consortium: Applying Quantified Self Approaches to Support Reflectiv...
 
Sample Lesson Log on MYDev Life Skills Opening Activity
Sample Lesson Log on MYDev Life Skills Opening ActivitySample Lesson Log on MYDev Life Skills Opening Activity
Sample Lesson Log on MYDev Life Skills Opening Activity
 
Introduction to Participatory Monitoring-Evaluation
Introduction to Participatory Monitoring-EvaluationIntroduction to Participatory Monitoring-Evaluation
Introduction to Participatory Monitoring-Evaluation
 
Readiness for direct practice - Using video as a tool to assess Masters socia...
Readiness for direct practice - Using video as a tool to assess Masters socia...Readiness for direct practice - Using video as a tool to assess Masters socia...
Readiness for direct practice - Using video as a tool to assess Masters socia...
 
Formative EvaluationFormative evaluation gives real results as t.docx
Formative EvaluationFormative evaluation gives real results as t.docxFormative EvaluationFormative evaluation gives real results as t.docx
Formative EvaluationFormative evaluation gives real results as t.docx
 
Agile Washington 2015 Creating a Learning Culture
Agile Washington 2015 Creating a Learning CultureAgile Washington 2015 Creating a Learning Culture
Agile Washington 2015 Creating a Learning Culture
 
Ville´s presentation
Ville´s presentationVille´s presentation
Ville´s presentation
 
Article..
Article..Article..
Article..
 
Agile Manifesto and Practices Selection for Tailoring Software Development
Agile Manifesto and Practices Selection for Tailoring Software DevelopmentAgile Manifesto and Practices Selection for Tailoring Software Development
Agile Manifesto and Practices Selection for Tailoring Software Development
 
Introduction to research techniques
Introduction to research techniquesIntroduction to research techniques
Introduction to research techniques
 
Career Counseling Application Prototype.pptx
Career Counseling Application Prototype.pptxCareer Counseling Application Prototype.pptx
Career Counseling Application Prototype.pptx
 
Using Repertory Grids as a cross cultural research technique (aka measuring a...
Using Repertory Grids as a cross cultural research technique (aka measuring a...Using Repertory Grids as a cross cultural research technique (aka measuring a...
Using Repertory Grids as a cross cultural research technique (aka measuring a...
 

Mais de Albert Bifet

Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream miningAlbert Bifet
 
MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 Albert Bifet
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAAlbert Bifet
 
Efficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersEfficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersAlbert Bifet
 
Apache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache FlinkApache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache FlinkAlbert Bifet
 
Introduction to Big Data Science
Introduction to Big Data ScienceIntroduction to Big Data Science
Introduction to Big Data ScienceAlbert Bifet
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataAlbert Bifet
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data ScienceAlbert Bifet
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data ManagementAlbert Bifet
 
A Short Course in Data Stream Mining
A Short Course in Data Stream MiningA Short Course in Data Stream Mining
A Short Course in Data Stream MiningAlbert Bifet
 
Real-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsReal-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsAlbert Bifet
 
Multi-label Classification with Meta-labels
Multi-label Classification with Meta-labelsMulti-label Classification with Meta-labels
Multi-label Classification with Meta-labelsAlbert Bifet
 
Pitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid themPitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid themAlbert Bifet
 
STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.Albert Bifet
 
Efficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive WindowsEfficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive WindowsAlbert Bifet
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real TimeAlbert Bifet
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real TimeAlbert Bifet
 
Mining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsMining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsAlbert Bifet
 
Sentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming DataSentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming DataAlbert Bifet
 
Leveraging Bagging for Evolving Data Streams
Leveraging Bagging for Evolving Data StreamsLeveraging Bagging for Evolving Data Streams
Leveraging Bagging for Evolving Data StreamsAlbert Bifet
 

Mais de Albert Bifet (20)

Artificial intelligence and data stream mining
Artificial intelligence and data stream miningArtificial intelligence and data stream mining
Artificial intelligence and data stream mining
 
MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016 MOA for the IoT at ACML 2016
MOA for the IoT at ACML 2016
 
Mining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOAMining Big Data Streams with APACHE SAMOA
Mining Big Data Streams with APACHE SAMOA
 
Efficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersEfficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream Classifiers
 
Apache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache FlinkApache Samoa: Mining Big Data Streams with Apache Flink
Apache Samoa: Mining Big Data Streams with Apache Flink
 
Introduction to Big Data Science
Introduction to Big Data ScienceIntroduction to Big Data Science
Introduction to Big Data Science
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Internet of Things Data Science
Internet of Things Data ScienceInternet of Things Data Science
Internet of Things Data Science
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data Management
 
A Short Course in Data Stream Mining
A Short Course in Data Stream MiningA Short Course in Data Stream Mining
A Short Course in Data Stream Mining
 
Real-Time Big Data Stream Analytics
Real-Time Big Data Stream AnalyticsReal-Time Big Data Stream Analytics
Real-Time Big Data Stream Analytics
 
Multi-label Classification with Meta-labels
Multi-label Classification with Meta-labelsMulti-label Classification with Meta-labels
Multi-label Classification with Meta-labels
 
Pitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid themPitfalls in benchmarking data stream classification and how to avoid them
Pitfalls in benchmarking data stream classification and how to avoid them
 
STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.STRIP: stream learning of influence probabilities.
STRIP: stream learning of influence probabilities.
 
Efficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive WindowsEfficient Data Stream Classification via Probabilistic Adaptive Windows
Efficient Data Stream Classification via Probabilistic Adaptive Windows
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
 
Mining Big Data in Real Time
Mining Big Data in Real TimeMining Big Data in Real Time
Mining Big Data in Real Time
 
Mining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data StreamsMining Frequent Closed Graphs on Evolving Data Streams
Mining Frequent Closed Graphs on Evolving Data Streams
 
Sentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming DataSentiment Knowledge Discovery in Twitter Streaming Data
Sentiment Knowledge Discovery in Twitter Streaming Data
 
Leveraging Bagging for Evolving Data Streams
Leveraging Bagging for Evolving Data StreamsLeveraging Bagging for Evolving Data Streams
Leveraging Bagging for Evolving Data Streams
 

Último

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 

Último (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 

PAKDD 2011 TUTORIAL Handling Concept Drift: Importance, Challenges and Solutions

  • 1. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Handling Concept Drift: Importance, Challenges & Solutions A. Bifet, J. Gama, M. Pechenizkiy, I. Žliobaitė PAKDD-2011 Tutorial May 27, Shenzhen, China http://www.cs.waikato.ac.nz/~abifet/PAKDD2011/
  • 2. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Motivation for the Tutorial • in the real world data – more and more data is organized as data streams – data comes from complex environment, and it is evolving over time – concept drift = underlying distribution of data is changing • attention to handling concept drift is increasing, since predictions become less accurate over time • to prevent that, learning models need to adapt to changes quickly and accurately PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 2 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 3. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Objectives of the Tutorial • many approaches to handle drifts in a wide range of applications have been proposed • the tutorial aims to provide an integrated view of the adaptive learning methods and application needs PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 3 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 4. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Outline Block 1: PROBLEM Block 2: TECHNIQUES Block 3: EVALUATION Block 4: APPLICATIONS OUTLOOK PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 4 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 5. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Presenters & Schedule Indrė Žliobaitė Block 1: what concept drift is, why it needs 14:05 – 14:50 to be handled and how João Gama 14:50 – 15:40 Block 2: approaches for adaptive machine learning to handle concept drift PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 5 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 6. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Presenters & Schedule Albert Bifet Block 3: training/evaluation 16:00 – 16:45 of adaptive learning approaches and a software demo Mykola Pechenizkiy Block 4: challenges for handling concept drift 16:45 – 17:30 in different applications PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 6 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 7. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 7 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 8. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Block 1: introduction PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 8 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 9. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook The goals: • to introduce the problem of concept drift in supervised learning • to overview and categorize the main principles how concept drift can be (and is) handled PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 9 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 10. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Example: production PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 10 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 11. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook many process What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook many process parameters parameters Example: production prediction predict quality reduced and transformed parameters t1, t2 q 3 79 monitor 78 3S Konz. (INA&INAL) 2 2S 77 1 76 75 Target t[2] 0 74 -1 73 2S 72 -2 3S 0 100 200 300 400 -3 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 Num t[1] SIMCA-P+ 11 - 13.12.2005 10:17:16 PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 11 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 12. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook many process What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook many process parameters parameters Example: production prediction predict quality reduced and transformed parameters t1, t2 q 3 79 monitor 78 3S Konz. (INA&INAL) 2 2S 77 1 76 75 Target t[2] 0 74 -1 73 2S 72 -2 3S 0 100 200 300 400 -3 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 Num t[1] SIMCA-P+ 11 - 13.12.2005 10:17:16 after 2-3 months predictions get “out of tune” PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 12 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 13. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Static model Model does not change Process changes source: Evonik Industries PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 13 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 14. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook the supplier of raw a sensor breaks/ new operating material changes “wears off”/ crew is replaced Model does not change Process changes source: Evonik Industries new regulations to new production save electricity procedures PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 14 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 15. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Desired Properties of a System To Handle Concept Drift • Adapt to concept drift asap • Distinguish noise from changes – robust to noise, but adaptive to changes • Recognizing and reacting to reoccurring contexts • Adapting with limited resources – time and memory PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 15 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 16. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Setting: adaptive learning PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 16 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 17. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Supervised Learning Training: Learning Testing X' a mapping function population data y = L (x) 2. Application: Applying L to unseen data X Historical data Learner y' = L (X') 1. training y labels 2. application testing labels ? y' PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 17 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 18. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Learning with concept drift population Training: = ?? Testing data X' y = L (x) population Application: y' = L?? (X') X Historical data Learner y labels testing labels ? y' PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 18 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 19. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Online setting ? time Data arrives in a stream PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 19 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 20. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Adaptive Learning ? time updated model as time passes the model updates/retrains prediction itself if needed PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 20 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 21. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook How to Adapt Select the right training data and or adjust retrain the model incrementally ? PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 21 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 22. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook The main strategies how to handle concept drift Images.com PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 22 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 23. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Adaptive learning strategies Triggering Evolving Single classifier Detectors Forgetting variable windows fixed windows, Instance weighting Ensemble Dynamic Contextual ensemble adaptive dynamic integration, combination rules meta learning PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 23 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 24. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Adaptive learning strategies Triggering Evolving change detection and adapting at every step a follow up reaction Single classifier Detectors Forgetting variable windows fixed windows, Instance weighting Ensemble Dynamic Contextual ensemble adaptive dynamic integration, combination rules meta learning PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 24 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 25. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Adaptive learning strategies reactive, Triggering Evolving forgetting Single classifier Detectors Forgetting variable windows fixed windows, Instance weighting Ensemble Dynamic Contextual ensemble maintain adaptive dynamic integration, some combination rules meta learning memory PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 25 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 26. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Adaptive learning strategies Triggering Evolving forget old data and retrain at a fixed rate Single classifier Detectors Forgetting fixed windows, Instance weighting Ensemble Dynamic Contextual ensemble PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 26 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 27. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Fixed Training Window time train predict PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 27 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 28. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Fixed Training Window time PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 28 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 29. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Adaptive learning strategies Triggering Evolving detect a change and cut Single classifier Detectors Forgetting variable windows Ensemble Dynamic Contextual ensemble PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 29 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 30. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Variable Training Window detect a change PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 30 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 31. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Variable Training Window drop old data PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 31 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 32. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Variable Training Window retrain the model predict PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 32 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 33. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Adaptive learning strategies Triggering Evolving Single classifier Detectors Forgetting Ensemble Dynamic Contextual ensemble adaptive build many models, combination rules dynamically combine PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 33 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 34. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Dynamic Ensemble Classifier 1 Classifier 2 vote Classifier 3 Classifier 4 PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 34 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 35. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Dynamic Ensemble voter 1 time voter 2 voter 3 voter 4 TRUE PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 35 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 36. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Dynamic Ensemble punish voter 1 voter 1 reward voter 2 voter 2 punish voter 3 voter 3 reward voter 4 voter 4 TRUE PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 36 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 37. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Dynamic Ensemble punish voter 1 voter 1 reward voter 2 voter 2 punish voter 3 voter 3 reward voter 4 voter 4 TRUE TRUE PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 37 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 38. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Dynamic Ensemble punish voter 1 voter 1 voter 1 reward voter 2 voter 2 voter 2 punish voter 3 voter 3 voter 3 reward voter 4 voter 4 voter 4 TRUE TRUE PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 38 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 39. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Adaptive learning strategies Triggering Evolving Single classifier Detectors Forgetting Ensemble Dynamic Contextual ensemble build many models, dynamic integration, switch models according meta learning to the observed incoming data PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 39 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 40. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Contextual (Meta) Approaches partition the training data Group 1 = Classifier 1 build classifiers Group 3 = Classifier 3 Group 2 = Classifier 2 PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 40 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 41. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Contextual (Meta) Approaches find to which partition the new instance belongs and assign the classifier Group 1 = Classifier 1 Group 3 = Classifier 3 Group 2 = Classifier 2 PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 41 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 42. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook • adaptive learning approaches implicitly or explicitly assume some type of change PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 42 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 43. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Types of Changes: speed mean SUDDEN time mean GRADUAL time PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 43 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 44. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Types of Changes: reoccurrence mean ALWAYS NEW* time mean time mean REOCCURING time PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 44 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 45. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Typically Used* Triggering Evolving (change detection) (adapting every step) sudden drift Single classifier Detectors Forgetting variable windows fixed windows, Instance weighting Ensemble Dynamic Contextual ensemble adaptive * this mapping is dynamic integration, fusion rules not strict or crisp meta learning PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 45 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 46. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Typically Used Triggering Evolving (change detection) (adapting every step) Single classifier Detectors Forgetting variable windows fixed windows, Instance weighting Ensemble Dynamic Contextual ensemble adaptive dynamic integration, fusion rules meta learning PAKDD-2011 Tutorial, May 27, Shenzhen, China gradual drift 46
  • 47. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Typically Used Triggering Evolving (change detection) (adapting every step) Single classifier Detectors Forgetting reoccuring drift variable windows fixed windows, Instance weighting Ensemble Dynamic Contextual ensemble adaptive dynamic integration, fusion rules meta learning PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 47 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 48. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Which approach to use? • changes occur over time • we need models that evolve over time • choice of technique depends on – what type of change is expected – user goals/ applications PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 48 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 49. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Block summary Images.com PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 49 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 50. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Block summary • Concept drift is a problem in data stream applications, since static models lose accuracy • Different adaptive techniques exist • The choice of technique depends on – what change is expected – peculiarities of the task and application PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 50 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 51. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Block 2: methods and techniques PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 51 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 52. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook The goal: to discuss selected popular approaches from each of the major types in more detail • Data management Forgetting • Detection Methods Detectors Dynamic • Adaptation methods ensembles • Decision model management Contextual PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 52 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 53. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Input Data Short term Memory Data Management Control Learning Detection Adaptation Memory Model Management Model Adaptation PAKDD-2011 Tutorial, Output Model 53 May 27, Shenzhen, China
  • 54. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Data Management • Characterize the information stored in memory to maintain a decision model consistent with the actual state of the nature. – Full Memory. – Partial Memory. PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 54 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 55. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Weighting examples • Full Memory. – Store in memory sufficient statistics over all the examples. – Weighting the examples accordingly to their age. – Oldest examples are less important. • Weighting examples based on the age: wλ ( x) = exp(−λt x ) where example x was found tx time steps ago PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 55 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 56. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Weight as a Function of Age PAKDD-2011 Tutorial, 56 May 27, Shenzhen, China
  • 57. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Partial Memory • Partial Memory. – Store in memory only the most recent examples. – Examples are stored in a first-in first-out data structure. – At each time step the learner induces a decision model using only the examples that are included in the window. PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 57 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 58. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Partial Memory • The key difficulty is how to select the appropriate window size: – Small window • can assure a fast adaptability in phases with concept changes • In more stable phases it can affect the learner performance – Large window • produce good and stable learning results in stable phases • can not react quickly to concept changes. PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 58 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 59. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Partial Memory - Windows • Fixed Size windows. – Store in memory a fixed number of the most recent examples. – Whenever a new example is available: • it is stored in memory and • the oldest one is discarded. – This is the simplest method to deal with concept drift and can be used as a baseline for comparisons. • Adaptive Size windows. – the set of examples in the window is variable. – They are used in conjunction with a detection model. – Decreasing the size of the window whenever the detection model signals drift and increasing otherwise. PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 59 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 60. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Discussion • The data management model also indicates the forgetting mechanism. – Weighting examples: • Corresponds to a gradual forgetting. • The relevance of old information is less and less important. – Time windows • Corresponds to abrupt forgetting; • The examples are deleted from memory. – Of course we can combine both forgetting mechanisms by weighting the examples in a time window. PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 60 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 61. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Discussion • These methods are blind adaptation models: – There is no explicit change detection – Do not provide: • Any indication about change points • Dynamics of the process generating data PAKDD-2011 Tutorial, A. Bifet, J.Gama, M. Pechenizkiy, I.Zliobaite 61 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 62. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Detection Methods • The Detection Model characterizes the techniques and mechanisms for explicit drift detection. – An advantage of the detection model is they can provide: • Meaningful descriptions: – indicating change-points or – small time-windows where the change occurs • Quantification of the changes. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 62 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 63. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Detection Methods • Monitoring the evolution of performance indicators. (See Klinkenberg (98) for a good overview of these indicators) – Performance measures, – Properties of the data, etc • Monitoring distributions on two different time- windows (See D.Kifer, S.Ben-David, J.Gehrke; VLDB 04) – A reference window over past examples – A window over the most recent examples PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 63 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 64. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Monitoring the evolution of performance indicators • Klinkenberg (1998)proposed monitoring the values of three performance indicators: – Accuracy, recall and precision over time, – and then, comparing it to a confidence interval of standard sample errors – for a moving average value using the last M batches of each particular indicator. • FLORA family of algorithms (Widmer and Kubat, 1996) monitors accuracy and model size. – includes a window adjustment heuristic for a rule- based classifier. – FLORA3: dealing with recurring concepts. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 64 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 65. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Monitoring distributions on two different time-windows • D.Kifer, S.Ben-David, J.Gehrke (VLDB 04) – Propose algorithms (statistical tests based on Chernoff bound) that examine samples drawn from two probability distributions and decide whether these distributions are different. • Gama, Fernandes, Rocha: VFDTc (IDA 2006) – For streaming data decision tree induction • Exploits tree structure: nested trees – Continuously monitoring differences between two class- distribution of the examples: • the distribution when a node was built and the class-distribution when a node was a leaf and • the weighted sum of the class-distributions in the leaves descendant of that node. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 65 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 66. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Hoeffding Trees • Multiple windows – Each node recives information from different windows • Root node: oldest data • Leaves: most recent data PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 66 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 67. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook The RS Method • Implemented in the VFDTc system (IDA 2006) • For each decision node i, compute two estimates of the classification error. – Static error (SEi): • the distribution of the error at node i ; – Backed up error (BUEi): • the sum of the error distributions of all the descending leaves of the node i ; – With these two distributions: • we can detect the concept change, • by verifying the condition SEi ≤ BUEi PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 67 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 68. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook RS Method • Each new example traverses the tree from the root to a leaf – At the leaf: update the value of SEi – It makes the opposite path, and update the values of SEi and BUEi for each internal node, – Verify the regularization condition SEi ≤ BUEi : • If SEi ≤ BUEi then the node i is pruned to a new leaf. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 68 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 69. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 69 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 70. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook ADAPTATION METHODS PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 70 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 71. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Adaptation Methods • The Adaptation model characterizes the changes in the decision model do adapt to the most recent examples. – Blind Methods: • Methods that adapt the learner at regular intervals without considering whether changes have really occurred. – Informed Methods: • Methods that only change the decision model after a change was detected. They are used in conjunction with a detection model. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 71 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 72. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Granularity of Decision Models • Occurrences of drift can have impact in part of the instance space. – Global models: Require the reconstruction of all the decision model. • like naive Bayes, SVM, etc – Granular decision models: Require the reconstruction of parts of the decision model • like decision rules, decision trees PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 72 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 73. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook CVFDT algorithm • Mining time-changing data streams , G. Hulten, L. Spencer, and P. Domingos; ACM SIGKDD 2001 • Process examples from the stream indefinitely. – For each example (x, y), • Pass (x, y) down to a set of leaves using HT and all alternate trees of the nodes (x, y) passes through. • Add (x, y) to the sliding window of examples. • Periodically scans the internal nodes of HT. • Start a new alternate tree when a new winning attribute is found. – Tighter criteria to avoid excessive alternate tree creation. – Limit the total number of alternate trees PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 73 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 74. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Smoothly adjust to concept drift • Alternate trees are grown the Age<30? same way HT is. Yes No • Periodically each node with non-empty alternate trees enter a testing mode. Married? Car Type= Yes Sports Car? – M training examples to Yes No compare accuracy. Yes No – Prune alternate trees with Yes No Experience No non-increasing accuracy over <1 year? time. Yes No – Replace if an alternate tree is more accurate. No Yes PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 74 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 75. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Decision model management • Model management characterizes the number of decision models needed to maintain in memory. • The key issue here is the assumption that data generated comes from multiple distributions, – at least in the transition between contexts. – Instead of maintaining a single decision model several authors propose the use of multiple decision models. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 75 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 76. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Dynamic Weighted Majority • A seminal work, is the system presented by Kolter and Maloof (ICDM03, ICML05). • The Dynamic Weighted Majority algorithm (DWM) is an ensemble method for tracking concept drift. – Maintains an ensemble of base learners, – Predicts using a weighted-majority vote of these experts. – Dynamically creates and deletes experts in response to changes in performance. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 76 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 77. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook DETECTION ALGORITHMS PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 77 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 78. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Change Detection in Predictive Learning • When there is a change in the class-distribution of the examples: – The actual model does not correspond any more to the actual distribution. – The error-rate increases • Basic Idea: – Learning is a process. – Monitor the quality of the learning process: • Using Statistical Process Control techniques. – Monitor the evolution of the error rate. – Main Problems: • How to distinguish Change from Noise? • How to React to drift? PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 78 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 79. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook SPC-Statistical Process Control Learning with Drift Detection ,Gama, Medas, Gladys, Rodrigues; SBIA- LNCS Springer, 2004. • Suppose a sequence of examples in the form < xi,yi> • The actual decision model classifies each example in the sequence – In the 0-1 loss function, predictions are either True or False – The predictions of the learning algorithm are sequences: T, F,T, F,T, F,T,T,T, F, … – The Error is a random variable from Bernoulli trials. – The Binomial distribution gives the general form of the probability of observing a F: • pi = (#F/i) • where i is the number of trials. si = pi (1 − pi ) / i PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 79 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 80. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Statistical Processing Control Algorithm • Maintains two registers: – Pmin and Smin such that Pmin + Smin = min(pi + si ) – Minimum of the error rate taking into account the variance of the estimator. • At example j : the error of the learning algorithm will be – Out-control if pj + sj > pmin + α smin – In-control if pj + sj < pmin + β smin – Warning Level: if pmin + α smin > pj + sj > pmin + smin • The constants α and β depend on the desired confidence level. – Admissible values are = 2 and = 3. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 80 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 81. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Statistical Processing Control PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 81 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 82. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Illustrative Example: SEA Concepts • The stream: PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 82 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 83. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook The Individual Error of the Models PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 83 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 84. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook System Error PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 84 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 85. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Sequential Analysis • Sequential analysis developed a set of methods for monitoring change detection: – Sequential Probability Ratio Test - SPRT algorithm, D.Wald, Sequential Tests of Statistical Hypotheses. Annals of Mathematical Statistics 16 (2): 117–186; 1945 – Cumulative Sums - CUSUM: E. Page, Continuous Inspection Scheme. Biometrika 41,1954 PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 85 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 86. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook CUSUM Algorithms The CUSUM test is used to detect significant increases (or decreases) in the successive observations of a random variable x • Detecting significant increases: • Detecting decreases: – g0 = 0 – g0 = 0 – gt = max(0; gt-1+ (xt - α)) – gt = min(0; gt-1+ (xt - α)) • The decision rule is: • The decision rule is: – if gt > λ then alarm and gt = 0. – if gt < λ then alarm and gt = 0. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 86 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 87. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Page-Hinckley Test • The PH test is a sequential adaptation of the detection of an abrupt change in the average of a Gaussian signal. – It considers a cumulative variable mT , defined as the cumulated difference between the observed values and their mean till the current moment: PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 87 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 88. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook The Page-Hinckley Test • The minimum value of mt is also computed: – MT = min(mt ; t = 1, …,T). • The test monitors the difference between MT and mT : – PHT = mT - MT . • When this difference is greater than a given threshold (λ) we alarm a change in the distribution. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 88 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 89. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Illustrative Example – The left figure plots the on-line error rate of a learning algorithm. – The center plot is the accumulated on-line error. The slope increases after the occurrence of a change. – The right plot presents the evolution of the PH statistic. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 89 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 90. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Page-Hinckley Test • The PHt test is memoryless, and its accuracy depends on the choice of parameters α and λ. – Both parameters are relevant to control the trade- off between earlier detecting true alarms by allowing some false alarms. • The threshold λ depends on the admissible false alarm rate. – Increasing λ will entail fewer false alarms, but might miss some changes. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 90 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 91. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Algorithm Adaptive Window-ADWIN • Learning from Time-Changing Data with Adaptive Windowing, A.Bifet, R.Gavaldà (SDM'07) • Whenever two “large enough” subwindows of W exhibit “distinct enough” averages, – we can conclude that the corresponding expected values are different, and – the older portion of the window is dropped PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 91 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 92. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Algorithm ADWIN • The value of εcut for a partition W0 . W1 of W is computed as follows: – Let n0 and n1 be the lengths of W0 and W1 and – Let n be the length of W, so n = n0 + n1. – Let ηW0 and ηW1 be the averages of the values in W0 and W1, and W0 and W1 their expected values. – To obtain totally rigorous performance guarantees we define: PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 92 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 93. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Characteristics of Algorithm ADWIN • False positive rate bound: – If ηt remains constant within W, the probability that ADWIN shrinks the window at this step is at most δ. • False negative rate bound. – Suppose that for some partition of W in two parts W0W1 (where W1 contains the most recent items) we have: |η W0 - η W1| > 2*εcut. Then with probability 1 - δ ADWIN shrinks W to W1, or shorter. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 93 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 94. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Illustrative Examples Abrupt Change Gradual Change PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 94 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 95. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook ADWIN – Compressing Space • Using exponential histograms • Requires O(log W) space and time • It can provide exact counts for all the subwindows Example: Counting 1’s in a bit stream Compress: Compress again: A new 1 arrives: There are 3 buckets of size 1: There are 3 buckets of size 2: Remove buckets when a drift is detect:
  • 96. What CD is About :: Types of Drifts & Approaches :: Techniques in Detail :: Evaluation :: MOA Demo :: Types of Applications :: Outlook Summary Dimensions of Analysis: • Data management • Characterize the information stored in memory to maintain a decision model consistent with the actual state of the nature • Detection Methods • Characterizes the techniques and mechanisms for explicit drift detection • Adaptation methods • Characterizes the changes in the decision model do adapt to the most recent examples. • Decision model management • Characterize the number of decision models needed to maintain in memory. PAKDD-2011 Tutorial, A. Bifet, J. Gama, M. Pechenizkiy, I.Zliobaite 96 May 27, Shenzhen, China Handling Concept Drift: Importance, Challenges and Solutions
  • 97. What  CD  iis  About  ::  Types  of  Dri5s  &    Approaches  ::  Techniques  iin  Detail  ::  Evalua=on  :::  MOA  Demo  :::  Types  of  Applica=ons  :::  Outlook   What  CD   s  About  ::  Types  of  Dri5s  & Approaches  ::  Techniques   n  Detail  ::  Evalua.on   :  MOA  Demo   :  Types  of  Applica=ons   :  Outlook   Block  3:  evalua=on  and  MOA  demo   PAKDD-­‐2011  Tutorial,       A.  Bifet,  J.Gama,  M.  Pechenizkiy,  I.Zliobaite       98   May  27,  Shenzhen,  China   Handling  Concept  Dri5:  Importance,  Challenges  and  Solu=ons