Ensembling & Boosting
Concept Introduction
Wayne Chen
201608
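Purpose of this talk
Build a feel for the field of data analysis
So that meeting someone who claims to have competed in contests does not leave you uneasy and in awe
Even if you never use these techniques, the concepts may still be worth borrowing
If Deep Learning changed the rules of the ML game ...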
XGBoost : Kaggle Winning Solution
Giuliano Janson: Won two games and retired from Kaggle
Persistence: every Kaggler nowadays can put up a great model in a few hours
and usually achieve 95% of final score. Only persistence will get you the
remaining 5%.
Ensembling: need to know how to do it "like a pro". Forget about averaging
models. Nowadays many Kaggler do meta-models, and meta-meta-models.
Why is ensembling needed?
Occam's Razor
● An explanation of the data should be made as simple as possible, but no simpler.
Simple methods beat complex ones. Simple is good. Any waste is bad.
Combining several simple models can work better than a single complex model
● Training data might not provide sufficient information for choosing a single best learner.
● The search processes of the learning algorithms might be imperfect (difficult to achieve unique
best hypothesis)
● Hypothesis space being searched might not contain the true target function.
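What "simple methods" means here
ID3, C4.5, CART … tree-based methods
Entropy
Example: find the people who love spending, splitting on gender: 5 spenders (1 M, 4 F), 9 non-spenders (6 M, 3 F)
● E_all = -(5/14) * log2(5/14) - (9/14) * log2(9/14)
● Entropy is 1 if 50% - 50%, 0 if 100% - 0%
Information Gain
● How much the entropy drops, relative to the original, after choosing attribute a as the split attribute
● E_gender = P(M) * E(1,6) + P(F) * E(4,3)
● Gain = E_all - E_gender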
http://www.saedsayad.com/decision_tree.htm
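The entropy and information-gain arithmetic above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function names `entropy` and `information_gain` are illustrative and not from the slides, and base-2 logarithms are assumed because the slide states that a 50/50 split has entropy 1.

```python
import math

def entropy(counts):
    """Shannon entropy (base 2) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, child_counts_list):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    total = sum(parent_counts)
    weighted = sum(sum(child) / total * entropy(child)
                   for child in child_counts_list)
    return entropy(parent_counts) - weighted

# Slide example: 5 spenders and 9 non-spenders overall.
e_all = entropy([5, 9])
# Split on gender: men = (1 spender, 6 non-spenders), women = (4, 3).
gain_gender = information_gain([5, 9], [[1, 6], [4, 3]])

print(f"E_all = {e_all:.3f}, Gain(gender) = {gain_gender:.3f}")
```

With these counts, E_all comes out at about 0.940 and the gain from splitting on gender at about 0.152.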
What could go wrong with this?
The more precisely a model fits, the more skewed it may be
http://blogs.sas.com/content/jmp/2013/03/25/partitioning-a-quadratic-in-jmp/
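Boosting ensembles in one sentence
There is no greater good than recognizing a mistake and correcting it
Learning means giving mistakes more weight in memory, over and over, and then improving
What is done wrong cannot be undone; take the lesson and work on not repeating the mistake
1. A mistake is a mistake: do not throw it away, and do not dwell on it
2. Remember where you went wrong and weight it more heavily next time
3. Keep learning until you score 100 on every exam (just kidding)
Learn to use ensembles in one second
You have probably already tried a few different models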
● Decision tree, NN, SVM, Regression ..
Ensemble the Kaggle submission CSV files. → It works!
Majority Voting
● Three models : 70%, 70%, 70%
● Majority vote ensemble will be ~78%.
● Averaging predictions often reduces overfit.
http://mlwave.com/kaggle-ensembling-guide/
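The ~78% figure follows from a binomial calculation. The sketch below assumes the three models make independent errors on a binary task, which real models rarely do exactly; `majority_vote_accuracy` is an illustrative name, not something from the slides.

```python
from math import comb

def majority_vote_accuracy(p, n_models):
    """Probability that a strict majority of n independent models,
    each correct with probability p, votes for the right answer."""
    k_min = n_models // 2 + 1  # smallest number of correct votes that wins
    return sum(comb(n_models, k) * p**k * (1 - p)**(n_models - k)
               for k in range(k_min, n_models + 1))

# Three models at 70% accuracy each: 0.7^3 + 3 * 0.7^2 * 0.3
print(round(majority_vote_accuracy(0.7, 3), 4))
```

Under the same independence assumption, adding more models of the same quality pushes the figure higher, which is why diversity among models matters as much as their individual accuracy.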
Pitfalls of ensembling
If you put Kobe, Curry, and LBJ on one team, will they win the championship?
Uncorrelated models usually perform better
Base models should be as accurate as possible and as diverse as possible
Common mechanisms: Majority Vote, Weighted Averaging
Voting Ensemble → RandomForest → GradientBoostingMachine
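Highly correlated models (ground truth: 1111111111)
1111111100 = 80% accuracy
1111111100 = 80% accuracy
1011111100 = 70% accuracy
Majority vote: 1111111100 = 80% accuracy
Less correlated models
1111111100 = 80% accuracy
0111011101 = 70% accuracy
1000101111 = 60% accuracy
Majority vote: 1111111101 = 90% accuracy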
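The bit strings above can be reproduced with a small majority-vote routine. This sketch assumes the ground truth is all ones, as in the mlwave ensembling guide the deck cites; the helper names are illustrative.

```python
def majority_vote(predictions):
    """Per-position majority vote over equal-length strings of '0'/'1'."""
    n = len(predictions)
    return "".join(
        "1" if sum(p[i] == "1" for p in predictions) * 2 > n else "0"
        for i in range(len(predictions[0]))
    )

def accuracy(pred, truth):
    """Fraction of positions where the prediction matches the truth."""
    return sum(a == b for a, b in zip(pred, truth)) / len(truth)

truth = "1111111111"  # assumption: every true label is 1
correlated = ["1111111100", "1111111100", "1011111100"]
uncorrelated = ["1111111100", "0111011101", "1000101111"]

for name, group in (("correlated", correlated), ("uncorrelated", uncorrelated)):
    vote = majority_vote(group)
    print(name, vote, accuracy(vote, truth))
```

The correlated group gains nothing from voting, while the weaker but more diverse group ends up ahead of its best member.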
The ensemble method
you have surely heard of
● Randomly samples not
only the data but also the features
● Majority vote
● Minimal tuning
● Often outperforms many
more complex methods
n: subsample size
m: subfeature set size
tree size, tree number
http://www.slideshare.net/0xdata/jan-vitek-distributedrandomforest522013
Base learner: the basic model being ensembled, e.g. a single tree or a simple neural network
● Trained by a base learning algorithm (e.g. decision tree, neural network, ...)
Three main branches of training methods:
● Boosting - Boost weak learners to strong learners (sequential learners)
● Bagging - Like RandomForest, sampling from data or features
● Stacking - the idea of packing models together (parallel learners)
● Employing different learning algorithms to train individual learners
● Individual learners then combined by a second-level learner which is
called meta-learner.
Ensemble keywords
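Bagging Ensemble: Bootstrap Aggregating
Each round, draw m data points (a bootstrap sample) and train a base learner by calling a base
learning algorithm
● Choosing the sampling ratio takes skill
● You can even train different models on sub-datasets built from different features
○ Cherkauer (1996): volcano identification project, 32 NNs split by input feature
● Inject randomness
○ random initialization for backpropagation, random feature selection for trees
● Majority voting
Advantage: preserves the diversity of the overall set of hypotheses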
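A minimal sketch of bagging with majority voting, using only the standard library. The base learner here is a one-feature threshold classifier (a decision stump), chosen for brevity rather than taken from the slides; the toy data and all function names are assumptions for illustration.

```python
import random

def train_stump(sample):
    """Fit a threshold rule on (x, label) pairs: try every observed x as a
    threshold, in both orientations, and keep the one with fewest errors."""
    best = None
    for t in sorted({x for x, _ in sample}):
        for polarity in (1, 0):  # polarity 1 means "predict 1 when x >= t"
            errors = sum((polarity if x >= t else 1 - polarity) != y
                         for x, y in sample)
            if best is None or errors < best[0]:
                best = (errors, t, polarity)
    _, t, polarity = best
    return lambda x: polarity if x >= t else 1 - polarity

def bagging(data, n_models, m, rng):
    """Train n_models stumps, each on a bootstrap sample of size m."""
    models = []
    for _ in range(n_models):
        sample = rng.choices(data, k=m)  # m draws with replacement
        models.append(train_stump(sample))
    return models

def bagged_predict(models, x):
    """Majority vote over the base learners (use an odd number of models)."""
    votes = sum(model(x) for model in models)
    return 1 if votes * 2 > len(models) else 0

# Toy data (hypothetical): label is 1 when x >= 10.
data = [(x, 1 if x >= 10 else 0) for x in range(20)]
models = bagging(data, n_models=25, m=len(data), rng=random.Random(0))
print(bagged_predict(models, 3), bagged_predict(models, 16))
```

Each stump sees a slightly different sample, so the thresholds differ; the vote averages out those differences, which is the variance reduction bagging is known for.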
Boost Family
● AdaBoost (Adaptive Boosting)
● Gradient Tree Boosting
● XGBoost
Combination of Additive Models
Learning converges efficiently
Risk of amplifying noise
● Bagging can significantly reduce the variance
● Boosting can significantly reduce the bias
http://slideplayer.com/slide/4816467/
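Starts by assigning equal weights to all the training examples,
then increases the weights of incorrectly classified examples.
AdaBoost characteristics
It performs very well in most cases, but its tendency to amplify noise
is a weakness that has to be addressed.
In each round of classification, we want to raise the probability that
misclassified points are classified correctly in the next round, and
lower the probability that they are misclassified again.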
http://www.37steps.com/exam/adaboost_comp/html/adaboost_comp.html
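One round of the reweighting can be sketched as follows. The slides only say that weights start equal and grow for misclassified examples; the exact formula below is the standard discrete AdaBoost update, used here as an assumption, and `adaboost_reweight` is an illustrative name.

```python
import math

def adaboost_reweight(weights, correct):
    """One discrete-AdaBoost round: returns the new normalized weights and
    the learner's vote weight alpha. Requires 0 < weighted error < 1."""
    eps = sum(w for w, ok in zip(weights, correct) if not ok)  # weighted error
    alpha = 0.5 * math.log((1 - eps) / eps)
    # Shrink the weights of correct examples, grow those of mistakes.
    updated = [w * math.exp(-alpha if ok else alpha)
               for w, ok in zip(weights, correct)]
    total = sum(updated)
    return [w / total for w in updated], alpha

weights = [1 / 5] * 5                       # equal initial weights
correct = [True, True, True, True, False]   # the last example was misclassified
new_weights, alpha = adaboost_reweight(weights, correct)
print([round(w, 3) for w in new_weights], round(alpha, 3))
```

After the update the misclassified example carries half of the total weight, so the next weak learner is pushed to get it right. The same mechanism explains the noise sensitivity: a mislabeled point keeps gaining weight round after round.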
Gradient Boosting
Additive training
● New predictor is optimized by moving in the opposite direction of the
gradient to minimize the loss function.
The decision trees in GBDT are shallow: depth generally does not exceed 5, and the number of leaf nodes does not exceed 10
● Boosted Tree: GBDT, GBRT, MART, LambdaMART
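A minimal sketch of additive training for regression, assuming squared-error loss, where the negative gradient is simply the residual. The base learner is a depth-1 regression tree on a single feature; the data, function names, and hyperparameter values are illustrative, not from the slides.

```python
def fit_stump(xs, targets):
    """Depth-1 regression tree: choose the threshold whose two leaf means
    give the smallest squared error on the targets."""
    best = None
    for t in sorted(set(xs))[1:]:  # thresholds leaving both sides non-empty
        left = [r for x, r in zip(xs, targets) if x < t]
        right = [r for x, r in zip(xs, targets) if x >= t]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x < t else rmean

def gradient_boost(xs, ys, n_rounds=100, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the residuals
    (the negative gradient of 1/2 * (y - F)^2) and add a shrunken copy."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    trees = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(tree(x) for tree in trees)

xs = list(range(10))
ys = [x * x for x in xs]            # toy target (hypothetical)
model = gradient_boost(xs, ys)
baseline = sum(ys) / len(ys)        # predict-the-mean reference
print(round(model(3), 2), round(model(8), 2))
```

Every new stump corrects what the current ensemble still gets wrong, so the training error falls steadily below that of the mean-only baseline.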
Gradient Boosting Model Steps
● Leaf weighted cost score
● Additive training: add one new model to the ensemble → choose the
model whose addition reduces the cost (error) the most
● Greedy algorithm to build new tree from a single leaf
● Gradient update weight
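The leaf score and the greedy split search can be made concrete with the formulas from the XGBoost paper listed in the references: a leaf with gradient sum G and hessian sum H gets weight -G / (H + lambda), and a split is scored by how much it improves G^2 / (H + lambda) summed over the leaves, minus a per-leaf penalty gamma. The function names below are illustrative, and the sketch covers only the scoring, not tree construction.

```python
def leaf_weight(g_sum, h_sum, reg_lambda=1.0):
    """Optimal leaf value for summed gradients g_sum and hessians h_sum."""
    return -g_sum / (h_sum + reg_lambda)

def leaf_score(g_sum, h_sum, reg_lambda=1.0):
    """Structure score of a leaf; larger means a lower loss is reachable."""
    return g_sum ** 2 / (h_sum + reg_lambda)

def split_gain(g_left, h_left, g_right, h_right, reg_lambda=1.0, gamma=0.0):
    """Loss reduction from splitting one leaf into a left and right child."""
    parent = leaf_score(g_left + g_right, h_left + h_right, reg_lambda)
    children = (leaf_score(g_left, h_left, reg_lambda)
                + leaf_score(g_right, h_right, reg_lambda))
    return 0.5 * (children - parent) - gamma

# Squared-error example: gradient = prediction - label, hessian = 1 per row.
# Two rows with gradient -2 go left, two rows with gradient +2 go right.
print(leaf_weight(-4, 2, reg_lambda=0.0))        # value of the left leaf
print(split_gain(-4, 2, 4, 2, reg_lambda=0.0))   # gain of this split
```

The greedy algorithm evaluates this gain for every candidate split and keeps the best one; a split whose gain is not positive after subtracting gamma is not worth making.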
Training Tips
Shrinkage
● Reduces the influence of each individual tree and leaves space for
future trees to improve the model.
● Better to improve the model by many small steps than by large steps.
Subsampling, Early Stopping, Post-Pruning
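Early stopping can be sketched as a rule over per-round validation losses: stop once the loss has not improved for a set number of rounds and keep the best round. The function name, the `patience` parameter, and the loss values below are all hypothetical; boosting libraries ship their own versions of this logic.

```python
def early_stopping_round(val_losses, patience=3):
    """Index of the best round, scanning until the validation loss has
    failed to improve for `patience` consecutive rounds."""
    best_round, best_loss = 0, float("inf")
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_round, best_loss = i, loss
        elif i - best_round >= patience:
            break  # no improvement for `patience` rounds: stop adding trees
    return best_round

# Made-up validation losses, one per boosting round.
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.6]
print(early_stopping_round(losses, patience=3))
```

A small patience stops training sooner but can miss a later improvement, as the last value in this list shows; a smaller learning rate usually calls for more rounds and a larger patience.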
● Of 29 challenge-winning solutions in 2015, 17 used XGBoost (deep neural
nets: 11)
● Every winning solution at KDDCup 2015 mentions it.
● Use it and go straight into the leaderboard top 10
Scalability enables data scientists to process hundreds of millions of examples
on a desktop.
● OpenMP CPU multi-thread
● DMatrix
● Cache-aware and Sparsity-aware
Why XGBoost is so powerful
Column Block for Parallel Learning
The most time consuming part of tree learning is to get the data into sorted
order.
Data is stored in in-memory blocks in compressed column format, with each column sorted by the
corresponding feature value. Block compression, block sharding.
Results
Use it in Python
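from xgboost import XGBClassifier

# nthread and seed are the parameter names from 2016-era releases;
# newer xgboost releases name them n_jobs and random_state.
xgb_model = XGBClassifier(
    learning_rate=0.1, n_estimators=1000, max_depth=5, min_child_weight=1,
    gamma=0, subsample=0.8, colsample_bytree=0.8, objective='binary:logistic',
    nthread=8, scale_pos_weight=1, seed=27)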
● gamma : Minimum loss reduction required to make a further partition on a
leaf node of the tree.
● min_child_weight : Minimum sum of instance weight(hessian) needed in a
child.
● colsample_bytree : Subsample ratio of columns when constructing each
tree.
Ensemble in Kaggle
Voting ensembles, Weighted majority vote, Bagged Perceptrons, Rank
averaging, Historical ranks, Stacking & Blending (Netflix)
Image classification competition
● Voting ensemble of around 30 convnets. The best single model scored
0.93170. Final score 0.94120.
Ensemble in Kaggle
No Free Lunch
An ensemble is often much better than a single learner.
Bias-variance tradeoff → apply boosting or averaging/voting.
● Not interpretable, like DNNs and non-linear SVMs
● There is no ensemble method which outperforms other ensemble methods
consistently
Selecting some base learners instead of using all of them to compose an
ensemble is a better choice -- selective ensembles
XGBoost (tabular data) vs. Deep Learning (larger and more complex data, hard to tune)
Reference
● Gradient boosting machines, a tutorial - Alexey Natekin and Alois Knoll
● XGBoost: A Scalable Tree Boosting System - Tianqi Chen
● NTU cmlab http://www.cmlab.csie.ntu.edu.tw/~cyy/learning/tutorials/
● http://mlwave.com/kaggle-ensembling-guide/
