bagusco@gmail.com
bagusco@ipb.ac.id
KDD Cup 2010: Overview
• The Challenge
   – How generally or narrowly do students learn? How quickly or
     slowly? Will the rate of improvement vary between students?
     What does it mean for one problem to be similar to another?

   – Is it possible to infer the knowledge requirements of problems
     directly from student performance data, without human analysis
     of the tasks?

   – This year's challenge asks you to predict student performance
     on mathematical problems from logs of student interaction with
     Intelligent Tutoring Systems.
KDD Cup 2010: Results
• Winners of KDD Cup 2010: All Teams
   – First Place: National Taiwan University
     Feature engineering and classifier ensembling for KDD CUP
     2010

   – First Runner Up: Zhang and Su
     Gradient Boosting Machines with Singular Value Decomposition

   – Second Runner Up: BigChaos @ KDD
     Collaborative Filtering Applied to Educational Data Mining
Outline
•   What is Ensemble Learning?
•   Why Ensemble?
•   How good is Ensemble?
•   What next?
Predictive Modeling
• Widely-used in many applications:
  – Business
     • Churn modeling, Scoring
  – Science
     • Chemometrics
  – Bio-Science
     • Efficacy modeling, Classification
  – Academics
     • Admission selection, student performance
Predictive Modeling
  Training Set → Model Development → Predictive Rules
  Predictive Rules + New Data Set → Prediction
Classical Approach: Model Selection
   Which one is the best?
New Approach?: Ensemble
   Combine all models!!!
What is Ensemble?
• Single Expert   vs   Team of Experts
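A rough way to see why a team of experts can beat a single expert is the following minimal sketch. It assumes every expert is right with the same probability and that the experts err independently, which real models rarely do, so the numbers are an idealized upper bound rather than a promise. The function name `majority_correct_probability` is made up for this illustration.

```python
from math import comb


def majority_correct_probability(n_experts: int, p_correct: float) -> float:
    """Probability that a strict majority of independent experts is right.

    Assumes each expert is correct with probability p_correct and that
    their errors are independent (an idealization).
    """
    if n_experts % 2 == 0:
        raise ValueError("use an odd number of experts to avoid tied votes")
    needed = n_experts // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_experts, k) * p_correct**k * (1 - p_correct) ** (n_experts - k)
        for k in range(needed, n_experts + 1)
    )


if __name__ == "__main__":
    # One expert versus teams of 5 and 21 experts, each 60% accurate.
    for n in (1, 5, 21):
        print(n, round(majority_correct_probability(n, 0.6), 3))
```

Under these assumptions the team improves on the single expert only when each member is better than a coin flip; with experts below 50% accuracy the majority vote gets worse.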
What is Ensemble?
  Data Set
    → Training Set #1, Training Set #2, ..., Training Set #k
    → Learning on each training set
    → Model #1, Model #2, ..., Model #k
    → Combiner
    → Ensemble Prediction
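The combiner step in the diagram above can be sketched as follows. This is only an illustration: the models are hypothetical hand-written rules rather than anything learned from data, and `majority_vote` and `ensemble_predict` are names chosen here, not part of any library.

```python
from collections import Counter
from statistics import mean
from typing import Any, Callable, Sequence


def majority_vote(predictions: Sequence[Any]) -> Any:
    """Combiner for class labels: return the most common prediction."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(
    models: Sequence[Callable[[float], Any]],
    x: float,
    combiner: Callable[[Sequence[Any]], Any],
) -> Any:
    """Ask every model for a prediction, then let the combiner decide."""
    return combiner([model(x) for model in models])


# Hypothetical classifiers: each stands in for "Model #1 ... Model #k".
classifiers = [lambda x: x > 1, lambda x: x > 2, lambda x: x > 3]

# Hypothetical regressors, combined by averaging instead of voting.
regressors = [lambda x: 2 * x, lambda x: 2 * x + 1, lambda x: 2 * x - 1]

if __name__ == "__main__":
    print(ensemble_predict(classifiers, 2.5, majority_vote))
    print(ensemble_predict(regressors, 3, mean))
```

Voting suits class labels and averaging suits numeric predictions; the rest of the pipeline stays the same.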
Types of Ensemble
• Hybrid Ensemble
  – Combining several different learning algorithms into
    one prediction
  – e.g., combining the results of regression, tree, neural
    net, and support vector machine models

• Non-Hybrid Ensemble
  – Combining several learning models from the same
    algorithm into one prediction
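A hybrid ensemble can be sketched with three deliberately simple and different learners voting on one prediction. Everything here is a toy assumption made for illustration: the data are one-dimensional, the labels are 0 and 1, class 1 is assumed to have the larger values, and the three `fit_*` functions are hypothetical stand-ins for the regression, tree, neural net, and support vector machine models named above.

```python
from collections import Counter
from statistics import mean


def fit_nearest_mean(xs, ys):
    """Predict the class whose training mean is closest to x."""
    means = {c: mean(x for x, y in zip(xs, ys) if y == c) for c in set(ys)}
    return lambda x: min(means, key=lambda c: abs(x - means[c]))


def fit_one_nearest_neighbor(xs, ys):
    """Predict the label of the single closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(x - p[0]))[1]


def fit_midpoint_threshold(xs, ys):
    """Cut halfway between the largest class-0 and smallest class-1 value.

    Assumes class 1 lies above class 0 on the x axis.
    """
    top_of_class_0 = max(x for x, y in zip(xs, ys) if y == 0)
    bottom_of_class_1 = min(x for x, y in zip(xs, ys) if y == 1)
    cut = (top_of_class_0 + bottom_of_class_1) / 2
    return lambda x: 1 if x >= cut else 0


def fit_hybrid(xs, ys):
    """Train three different algorithms on the same data and let them vote."""
    fitters = (fit_nearest_mean, fit_one_nearest_neighbor, fit_midpoint_threshold)
    models = [fit(xs, ys) for fit in fitters]
    return lambda x: Counter(m(x) for m in models).most_common(1)[0][0]


xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
ys = [0, 0, 0, 1, 1, 1]
hybrid = fit_hybrid(xs, ys)

if __name__ == "__main__":
    print([hybrid(x) for x in (2.5, 6.0)])
```

A non-hybrid ensemble would instead call one of these fitting functions several times on different training sets.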
Well-Known Ensembles
• Bagging
  – Fit a learning model to each bootstrap sample
  – Aggregate the predictions by averaging or majority vote
• Boosting (AdaBoost)
  – Fit learning models sequentially, giving higher weight to
    ‘difficult’ cases
  – Combine the predictions with a weighted vote
• Random Forest
  – Similar to bagging, but with random feature selection when
    each learning model is generated
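Bagging, and the random feature selection that Random Forest adds to it, can be sketched in plain Python. This is a simplified illustration, not a reference implementation: decision stumps (one-split trees) stand in for full trees, labels are assumed to be 0 and 1, and features are drawn once per model as the slide describes, whereas Breiman's random forest draws them at every split. All function names and the `n_features` parameter are made up for this sketch.

```python
import random
from collections import Counter


def fit_stump(X, y, features):
    """Find the single split (feature, threshold, direction) with fewest errors."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in X}):
            for polarity in (1, -1):
                preds = [1 if polarity * (row[f] - threshold) >= 0 else 0 for row in X]
                errors = sum(p != t for p, t in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, polarity)
    _, f, threshold, polarity = best
    return lambda row: 1 if polarity * (row[f] - threshold) >= 0 else 0


def fit_bagged_stumps(X, y, n_models=25, n_features=None, seed=0):
    """Fit one stump per bootstrap sample.

    n_features=None uses every feature (plain bagging); a smaller value
    gives each model a random subset of features (random-forest style).
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw n rows with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        if n_features is None:
            features = list(range(d))
        else:
            features = rng.sample(range(d), n_features)
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return models


def predict(models, row):
    """Aggregate the individual predictions by majority vote."""
    return Counter(m(row) for m in models).most_common(1)[0][0]


# Toy data: class 1 has a large first feature and a small second feature.
X = [[float(i), float(9 - i)] for i in range(10)]
y = [0] * 5 + [1] * 5

if __name__ == "__main__":
    bagged = fit_bagged_stumps(X, y)
    forest_like = fit_bagged_stumps(X, y, n_features=1, seed=1)
    print(predict(bagged, [0.0, 9.0]), predict(forest_like, [9.0, 0.0]))
```

Boosting is not shown here; it differs in that the models are fitted one after another on reweighted data instead of on independent bootstrap samples.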
How Good is Ensemble?
  [Chart: error rate (y-axis, 0 to 0.7) for tree, bagging, and adaboost; x-axis numbered 1 to 33]
  Source: Dietterich (1999)
How Good is Ensemble?
  [Chart: AUC (y-axis, 0.4 to 0.9) for CART, C45, Bagging, Random Forest, Rotation Forest, and Rotation Boost across DIY, Bank, Telecom1, and Mail-order]
  Source: Bock & Poel (2011)
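For readers unfamiliar with the AUC measure used in this comparison, here is a minimal sketch of how it can be computed from model scores. It uses the pairwise definition: the share of (positive, negative) pairs in which the positive case gets the higher score, with ties counted as half. The function name `auc` and the 0/1 label coding are assumptions of this sketch, and it says nothing about how the cited study computed its figures.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels are assumed to be 1 (positive) or 0 (negative).
    """
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Three of the four positive/negative pairs are ranked correctly.
    print(auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why the chart's axis starts near 0.4.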
What Next
• Ensemble Predictive Models

• Class-Imbalance Models
  – Gradient Boosting, EasyEnsemble, BalanceCascade, SMOTEBoost

• Robust Predictive Models
  – Noise Ensemble
Ensemble in SAS/EM
THANK YOU
Bagus Sartono
Educational Background
• Bachelor of Science in Stats – IPB (2000)
• Master of Science in Stats – IPB (2004)
• PhD in Applied Economics – University of Antwerp (2012)
Professional Experience
• Lecturer – Dept of Stats IPB
• Experienced Trainer in Analytics (Bank Indonesia, Bank
  Mandiri, Ganesha Cipta Informatika, CIFOR, LIPI, LPEM-UI, etc.)

Improving the Model’s Predictive Power with Ensemble Approaches
