Classification: Decision Tree

Classification by Decision Tree Induction
• Decision tree
– A flow-chart-like tree structure
– Internal node denotes a test on an attribute
– Branch represents an outcome of the test
– Leaf nodes represent class labels or class distribution
– The topmost node in the tree is the root node.
• Decision tree generation consists of two phases
– Tree construction
• At start, all the training examples are at the root
• Partition examples recursively based on selected attributes
– Tree pruning
• Identify and remove branches that reflect noise or outliers
• Use of decision tree: Classifying an unknown sample
– Test the attribute values of the sample against the decision tree
Decision Tree for PlayTennis

Outlook
  Sunny    → Humidity
               High   → No
               Normal → Yes
  Overcast → (subtree completed on a later slide)
  Rain     → (subtree completed on a later slide)

Each internal node tests an attribute.
Each branch corresponds to an attribute value.
Each leaf node assigns a classification.

Three questions drive tree construction:
(1) Which attribute to start with? (the root)
(2) Which node to proceed to next?
(3) When to stop and come to a conclusion?

Decision trees classify instances or examples by starting at the root of the tree and moving through it until a leaf node is reached.
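As an illustration (not part of the original slides), this root-to-leaf walk fits in a few lines of Python; the tree below is the complete PlayTennis tree shown a few slides later:

```python
# A decision tree as nested structures: internal nodes are
# ("attribute", {value: subtree}), leaves are class labels.
play_tennis_tree = ("Outlook", {
    "Sunny":    ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Rain":     ("Wind", {"Strong": "No", "Weak": "Yes"}),
})

def classify(tree, example):
    """Start at the root; test one attribute per node until a leaf is reached."""
    while isinstance(tree, tuple):            # still at an internal node
        attribute, branches = tree
        tree = branches[example[attribute]]   # follow the matching branch
    return tree                               # leaf: the class label

print(classify(play_tennis_tree,
               {"Outlook": "Sunny", "Humidity": "Normal", "Wind": "Weak"}))
# -> "Yes"
```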
Decision Tree for Conjunction: Outlook=Sunny ∧ Wind=Weak

Outlook
  Sunny    → Wind
               Strong → No
               Weak   → Yes
  Overcast → No
  Rain     → No
Decision Tree for Disjunction: Outlook=Sunny ∨ Wind=Weak

Outlook
  Sunny    → Yes
  Overcast → Wind
               Strong → No
               Weak   → Yes
  Rain     → Wind
               Strong → No
               Weak   → Yes
Decision Tree for XOR: Outlook=Sunny XOR Wind=Weak

Outlook
  Sunny    → Wind
               Strong → Yes
               Weak   → No
  Overcast → Wind
               Strong → No
               Weak   → Yes
  Rain     → Wind
               Strong → No
               Weak   → Yes
Decision Tree

Outlook
  Sunny    → Humidity
               High   → No
               Normal → Yes
  Overcast → Yes
  Rain     → Wind
               Strong → No
               Weak   → Yes

• Decision trees represent disjunctions of conjunctions:
  (Outlook=Sunny ∧ Humidity=Normal)
  ∨ (Outlook=Overcast)
  ∨ (Outlook=Rain ∧ Wind=Weak)
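This equivalence can be checked exhaustively; the sketch below (illustrative, not from the slides) enumerates all twelve attribute combinations and confirms that the tree above and the formula agree:

```python
from itertools import product

tree = ("Outlook", {
    "Sunny":    ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Rain":     ("Wind", {"Strong": "No", "Weak": "Yes"}),
})

def classify(tree, example):
    while isinstance(tree, tuple):
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree

def formula(e):
    # (Outlook=Sunny ∧ Humidity=Normal) ∨ (Outlook=Overcast) ∨ (Outlook=Rain ∧ Wind=Weak)
    return ((e["Outlook"] == "Sunny" and e["Humidity"] == "Normal")
            or e["Outlook"] == "Overcast"
            or (e["Outlook"] == "Rain" and e["Wind"] == "Weak"))

for o, h, w in product(["Sunny", "Overcast", "Rain"],
                       ["High", "Normal"], ["Strong", "Weak"]):
    e = {"Outlook": o, "Humidity": h, "Wind": w}
    assert (classify(tree, e) == "Yes") == formula(e)
print("tree and formula agree on all 12 combinations")
```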
When to consider Decision Trees
• Instances describable by attribute-value pairs
• Target function is discrete valued
• Disjunctive hypothesis may be required
• Possibly noisy training data
• Missing attribute values
• Examples:
– Medical diagnosis
– Credit risk analysis
– Object classification for robot manipulator (Tan
1993)
A simple example
• You want to guess the outcome of next week's game
between the MallRats and the Chinooks.
• Available knowledge / attributes:
– Was the game at Home or Away?
– Was the starting time 5 pm, 7 pm, or 9 pm?
– Did Joe play center or forward?
– Was the opponent's center tall or not?
– …
Basketball data
What do we know?
• The game will be away, at 9 pm, and Joe will play center on offense…
• This is a classification problem
• We must generalize the learned rule to new examples
• What you don't know, of course, is who will win this game.
• It is reasonable to assume that this future game will resemble the past games. Note, however, that there are no previous games that match these specific values, i.e., no previous game was exactly [Where=Away, When=9pm, FredStarts=No, JoeOffense=Center, JoeDefends=Forward, OppC=Tall]. We therefore need to generalize, by using the known examples to infer the likely outcome of this new situation. But how?
Use a decision tree to determine who should win the game.
As we did not indicate the outcome of this game, we call it an "unlabeled instance"; the goal of a classifier is to find the class label for such unlabeled instances.
An instance that also includes the outcome is called a "labeled instance"; e.g., the first row of the table corresponds to a labeled instance.
Decision Trees
In general, a decision tree is a tree structure; see the left-hand figure below.
Example of a Decision Tree

Training Data:

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Model: Decision Tree

Refund
  Yes → NO
  No  → MarSt
          Married          → NO
          Single, Divorced → TaxInc
                               < 80K → NO
                               > 80K → YES
Apply Model to Test Data

Test Data:

Refund  Marital Status  Taxable Income  Cheat
No      Married         80K             ?

Start at the root of the tree above. Refund = No, so take the No branch to MarSt; MarSt = Married, so take the Married branch, which ends at the leaf NO.
Assign Cheat to “No”
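The same walk as a short illustrative sketch (not from the slides); note how the numeric attribute is compared against the 80K split point, here treating values of 80K and above as the "> 80K" branch:

```python
def predict_cheat(record):
    """Traverse the Refund/MarSt/TaxInc model for one test record."""
    if record["Refund"] == "Yes":
        return "No"                       # Refund=Yes leaf: NO
    if record["Marital Status"] == "Married":
        return "No"                       # MarSt=Married leaf: NO
    # Single or Divorced: compare the continuous attribute to the split point
    return "Yes" if record["Taxable Income"] >= 80_000 else "No"

test = {"Refund": "No", "Marital Status": "Married", "Taxable Income": 80_000}
print(predict_cheat(test))  # -> "No" (Refund=No, then MarSt=Married)
```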
Decision Tree Algorithm

Principle
‒ Basic algorithm (adopted by ID3, C4.5 and CART): a greedy algorithm
‒ The tree is constructed in a top-down, recursive, divide-and-conquer manner (a sketch follows below)
‒ Attributes are categorical (continuous-valued attributes are discretized in advance)
‒ Choose the best attribute(s) to split the remaining instances and make that attribute a decision node

Iteration
‒ At the start, all the training tuples are at the root
‒ Tuples are partitioned recursively based on selected attributes
‒ Test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain)

Stopping conditions
‒ All samples for a given node belong to the same class
‒ There are no remaining attributes for further partitioning; majority voting is employed for classifying the leaf
‒ There are no samples left
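The following is a minimal, illustrative Python sketch of this greedy procedure (an ID3-style learner over lists of attribute/label dictionaries; it is not the exact code of ID3, C4.5, or CART):

```python
import math
from collections import Counter

def entropy(labels):
    """Info(D): expected bits needed to identify the class of a tuple in D."""
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())

def info_gain(rows, attr):
    """Gain(attr): entropy of D minus the weighted entropy of its partitions."""
    before = entropy([r["label"] for r in rows])
    after = 0.0
    for value in set(r[attr] for r in rows):
        part = [r["label"] for r in rows if r[attr] == value]
        after += (len(part) / len(rows)) * entropy(part)
    return before - after

def build_tree(rows, attrs):
    labels = [r["label"] for r in rows]
    if len(set(labels)) == 1:        # stop: all samples in one class
        return labels[0]
    if not attrs:                    # stop: no attributes left -> majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, a))
    remaining = [a for a in attrs if a != best]
    # Branch only over values present in rows, so no partition is ever empty
    # (the "no samples left" stopping condition is handled implicitly).
    return (best, {v: build_tree([r for r in rows if r[best] == v], remaining)
                   for v in set(r[best] for r in rows)})
```

Applied to the medical-diagnosis exercise at the end of these slides (with each row stored as a dict carrying a "label" key), this sketch reproduces the Swollen Glands → Fever tree derived there.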
Three Possible Partition Scenarios
Attribute Selection Measures

How to choose an attribute?
• An attribute selection measure is a heuristic for selecting the splitting criterion that "best" separates a given data partition, D, of class-labeled training tuples into individual classes.

Ideally
‒ Each resulting partition would be pure
‒ A pure partition is a partition containing tuples that all belong to the same class

• Attribute selection measures (splitting rules)
‒ Determine how the tuples at a given node are to be split
‒ Provide a ranking for each attribute describing the tuples
‒ The attribute with the highest score is chosen
‒ Determine a split point or a splitting subset

• Methods (see the sketch after this list)
– Information gain (ID3 (Iterative Dichotomiser 3) / C4.5)
– Gain ratio
– Gini index (IBM IntelligentMiner)
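A small sketch (not from the slides) contrasting two of these measures, entropy and the Gini index; a pure partition scores 0 under both:

```python
import math

def entropy(probs):
    """Expected information (bits) of a class distribution."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

def gini(probs):
    """Gini index: chance a random tuple is misclassified by a random label."""
    return 1 - sum(p * p for p in probs)

for dist in ([1.0, 0.0], [0.9, 0.1], [0.5, 0.5]):
    print(dist, round(entropy(dist), 3), round(gini(dist), 3))
# [1.0, 0.0] 0.0 0.0      <- pure: both measures are minimal
# [0.9, 0.1] 0.469 0.18
# [0.5, 0.5] 1.0 0.5      <- maximally mixed: both measures are maximal
```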
Before Describing Information Gain

Entropy is a measure of the average information content one is missing when one does not know the value of the random variable.
– Shannon's metric of "entropy" is a foundational concept of information theory.
– The entropy of a variable is the "amount of information" contained in the variable.

High entropy
– X is drawn from a uniform-like distribution
– Flat histogram
– Values sampled from it are less predictable

Low entropy
– X is drawn from a varied (peaks and valleys) distribution
– The histogram has many lows and highs
– Values sampled from it are more predictable
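A small numeric sketch (illustrative, not from the slides) of the high/low entropy contrast:

```python
import math

def entropy(probs):
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # flat histogram
peaked  = [0.85, 0.05, 0.05, 0.05]   # one dominant value
print(entropy(uniform))  # 2.0 bits -> samples are hard to predict
print(entropy(peaked))   # ~0.85 bits -> samples are much easier to predict
```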
First Approach: Information Gain
Assume there are two classes, P and N. Let the set of examples D contain p elements of class P and n elements of class N. The amount of information needed to decide whether an arbitrary example in D belongs to P or N is defined as

$$Info(D) = I(p, n) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}$$
(To evaluate numerically: $\log_2 x = \log_{10} x / \log_{10} 2$.)
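An illustrative transcription of I(p, n) into Python, together with the change-of-base identity above:

```python
import math

def I(p, n):
    """Info(D) for a set with p examples of class P and n of class N."""
    info = 0.0
    for k in (p, n):
        if k:                        # 0 * log2(0) is taken as 0
            f = k / (p + n)
            info += f * math.log2(1 / f)
    return info

print(I(3, 3))  # 1.0 -> an evenly mixed set needs one full bit
print(I(4, 0))  # 0.0 -> a pure set needs no further information
print(math.log10(5) / math.log10(2), math.log2(5))  # both ~2.3219
```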
Information Gain of an Attribute
• Assume that using attribute A, the set D will be partitioned into sets {D1, D2, …, Dv}
– If Di contains pi examples of P and ni examples of N, the entropy, or the expected information needed to classify objects in all subtrees, is

$$E(A) = \sum_{i=1}^{v} \frac{p_i + n_i}{p + n}\, I(p_i, n_i)$$

• The encoding information that would be gained by branching on A is

$$Gain(A) = I(p, n) - E(A)$$
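In code (an illustrative sketch; the sample numbers below are the well-known buys_computer "age" split, a set with 9 P-examples and 5 N-examples partitioned into (2,3), (4,0), (3,2)):

```python
import math

def I(p, n):
    """Two-class information I(p, n), with 0 * log2(0) taken as 0."""
    total = p + n
    return sum((k / total) * math.log2(total / k) for k in (p, n) if k)

def gain(p, n, partitions):
    """Gain(A) = I(p, n) - E(A) for partitions [(p1, n1), ..., (pv, nv)]."""
    E = sum(((pi + ni) / (p + n)) * I(pi, ni) for pi, ni in partitions)
    return I(p, n) - E

print(gain(9, 5, [(2, 3), (4, 0), (3, 2)]))  # ~0.246 bits gained by branching
```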
Extracting Classification Rules from Trees
• Represent the knowledge in the form of IF-THEN rules
• One rule is created for each path from the root to a leaf
• Each attribute-value pair along a path forms a conjunction
• The leaf node holds the class prediction
• Rules are easier for humans to understand
• Example
IF age = “<=30” AND student = “no” THEN buys_computer = “no”
IF age = “<=30” AND student = “yes” THEN buys_computer = “yes”
IF age = “31…40” THEN buys_computer = “yes”
IF age = “>40” AND credit_rating = “excellent” THEN buys_computer = “yes”
IF age = “>40” AND credit_rating = “fair” THEN buys_computer = “no”
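A sketch of this path-to-rule translation for the PlayTennis tree shown earlier (illustrative, not the slides' code):

```python
def extract_rules(tree, path=()):
    """Emit one IF-THEN rule per root-to-leaf path; tests conjoin along a path."""
    if not isinstance(tree, tuple):                     # leaf: the prediction
        body = " AND ".join(f'{a} = "{v}"' for a, v in path)
        return [f'IF {body} THEN class = "{tree}"']
    attribute, branches = tree
    rules = []
    for value, subtree in branches.items():
        rules += extract_rules(subtree, path + ((attribute, value),))
    return rules

play_tennis_tree = ("Outlook", {
    "Sunny":    ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Rain":     ("Wind", {"Strong": "No", "Weak": "Yes"}),
})
for rule in extract_rules(play_tennis_tree):
    print(rule)
# e.g. IF Outlook = "Sunny" AND Humidity = "High" THEN class = "No"
```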
Avoid Overfitting in Classification
• The generated tree may overfit the training data
– Too many branches, some of which may reflect anomalies due to noise or outliers
– The result is poor accuracy for unseen samples
• Two approaches to avoid overfitting
– Prepruning: Halt tree construction early—do not
split a node if this would result in the goodness
measure falling below a threshold
• Difficult to choose an appropriate threshold
– Postpruning: Remove branches from a “fully grown”
tree—get a sequence of progressively pruned trees
• Use a set of data different from the training data
to decide which is the “best pruned tree”
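Both strategies can be tried with scikit-learn, assuming it is available: min_impurity_decrease acts as a prepruning threshold, and cost-complexity pruning yields the "sequence of progressively pruned trees", scored here on held-out data (the dataset choice is an arbitrary placeholder):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Prepruning: do not split when the impurity decrease falls below a threshold.
pre = DecisionTreeClassifier(min_impurity_decrease=0.01).fit(X_train, y_train)

# Postpruning: grow fully, then choose among progressively pruned trees
# using data different from the training set.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(
    X_train, y_train)
best = max(
    (DecisionTreeClassifier(ccp_alpha=a, random_state=0).fit(X_train, y_train)
     for a in path.ccp_alphas),
    key=lambda t: t.score(X_val, y_val))
print(best.get_n_leaves(), best.score(X_val, y_val))
```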
Approaches to Determine the Final Tree Size
• Separate training (2/3) and testing (1/3) sets
• Use cross validation, e.g., 10-fold cross
validation
• Use all the data for training
– but apply a statistical test (e.g., chi-square) to
estimate whether expanding or pruning a node
may improve the entire distribution
• Use the minimum description length (MDL) principle:
– halt growth of the tree when the encoding is minimized
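A sketch of the first two approaches (assuming scikit-learn; the dataset and depth grid are arbitrary placeholders):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
# Separate training (2/3) and testing (1/3) sets, as on the slide.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=0)

# 10-fold cross-validation on the training set to choose a maximum depth.
scores = {d: cross_val_score(DecisionTreeClassifier(max_depth=d, random_state=0),
                             X_train, y_train, cv=10).mean()
          for d in range(1, 8)}
best_depth = max(scores, key=scores.get)
final = DecisionTreeClassifier(max_depth=best_depth,
                               random_state=0).fit(X_train, y_train)
print(best_depth, final.score(X_test, y_test))  # chosen depth, held-out accuracy
```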
Enhancements to basic decision tree induction
• Allow for continuous-valued attributes
– Dynamically define new discrete-valued attributes that
partition the continuous attribute value into a discrete
set of intervals
• Handle missing attribute values
– Assign the most common value of the attribute
– Assign probability to each of the possible values
• Attribute construction
– Create new attributes based on existing ones that are
sparsely represented
– This reduces fragmentation, repetition, and replication
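The first two enhancements are one-liners in pandas, assuming it is available (the column names here are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({
    "credit_rating": ["fair", "excellent", None, "fair", "excellent"],
    "income":        [125, 100, 70, 120, 95],   # a continuous-valued attribute
})

# Missing attribute values: assign the most common value of the attribute.
df["credit_rating"] = df["credit_rating"].fillna(df["credit_rating"].mode()[0])

# Continuous values: partition the range into a discrete set of intervals.
df["income_band"] = pd.cut(df["income"], bins=[0, 80, 110, float("inf")],
                           labels=["low", "medium", "high"])
print(df)
```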
Exercise: For the following medical diagnosis data, create a decision tree.

Sore Throat  Fever  Swollen Glands  Congestion  Headache  Diagnosis
YES          YES    YES             YES         YES       Strep Throat
NO           NO     NO              YES         YES       Allergy
YES          YES    NO              YES         NO        Cold
YES          NO     YES             NO          NO        Strep Throat
NO           YES    NO              YES         NO        Cold
NO           NO     NO              YES         NO        Allergy
NO           NO     YES             NO          NO        Strep Throat
YES          NO     NO              YES         YES       Allergy
NO           YES    NO              YES         YES       Cold
YES          YES    NO              YES         YES       Cold
S = Strep Throat (3) + Allergy (3) + Cold (4) = 10

$$Info(S) = -\frac{3}{10}\log_2\frac{3}{10} - \frac{3}{10}\log_2\frac{3}{10} - \frac{4}{10}\log_2\frac{4}{10}$$

Using $\log_2 x = \log_{10} x / \log_{10} 2$:

$$Info(S) = 0.6 \times \frac{0.5229}{0.3010} + 0.4 \times \frac{0.3979}{0.3010} = 0.6 \times 1.737 + 0.4 \times 1.322 = 1.042 + 0.529 \approx 1.571$$

Info(S) ≈ 1.571
Finding the Splitting Attribute
• Select the attribute with the highest gain.

Class counts for each value of Sore Throat:

Sore Throat  Strep Throat  Allergy  Cold
YES          2             1        2
NO           1             2        2

For each attribute value, Info(value) weighted by P(value) is summed; the total is the entropy of the attribute:
E(attribute) = P(YES) × Info(YES) + P(NO) × Info(NO)
$$Info(YES) = -\frac{2}{5}\log_2\frac{2}{5} - \frac{1}{5}\log_2\frac{1}{5} - \frac{2}{5}\log_2\frac{2}{5} \approx 1.52$$

$$Info(NO) = -\frac{1}{5}\log_2\frac{1}{5} - \frac{2}{5}\log_2\frac{2}{5} - \frac{2}{5}\log_2\frac{2}{5} \approx 1.52$$
E(Sore Throat) = P(YES) × 1.52 + P(NO) × 1.52 = (5/10) × 1.52 + (5/10) × 1.52 = 1.52

Gain(Sore Throat) = Info(S) − E(Sore Throat) = 1.571 − 1.52 ≈ 0.05
• Gain for each attribute:

Attribute        Gain
Sore Throat      0.05
Fever            0.72
Swollen Glands   0.88
Congestion       0.45
Headache         0.05
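These gains can be reproduced mechanically. The sketch below (illustrative, not from the slides) recomputes the whole table from the exercise data above, using Info(S) ≈ 1.571:

```python
import math
from collections import Counter

COLS = ["Sore Throat", "Fever", "Swollen Glands", "Congestion", "Headache"]
ROWS = [  # attribute values in COLS order, then the Diagnosis label
    ("YES", "YES", "YES", "YES", "YES", "Strep Throat"),
    ("NO",  "NO",  "NO",  "YES", "YES", "Allergy"),
    ("YES", "YES", "NO",  "YES", "NO",  "Cold"),
    ("YES", "NO",  "YES", "NO",  "NO",  "Strep Throat"),
    ("NO",  "YES", "NO",  "YES", "NO",  "Cold"),
    ("NO",  "NO",  "NO",  "YES", "NO",  "Allergy"),
    ("NO",  "NO",  "YES", "NO",  "NO",  "Strep Throat"),
    ("YES", "NO",  "NO",  "YES", "YES", "Allergy"),
    ("NO",  "YES", "NO",  "YES", "YES", "Cold"),
    ("YES", "YES", "NO",  "YES", "YES", "Cold"),
]

def entropy(labels):
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())

def gain(col):
    total = entropy([r[-1] for r in ROWS])      # Info(S), about 1.571
    e = 0.0
    for v in ("YES", "NO"):
        part = [r[-1] for r in ROWS if r[col] == v]
        if part:
            e += (len(part) / len(ROWS)) * entropy(part)
    return total - e

for i, name in enumerate(COLS):
    print(f"{name}: {gain(i):.2f}")
# Sore Throat: 0.05, Fever: 0.72, Swollen Glands: 0.88,
# Congestion: 0.45, Headache: 0.05
```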
Decision Tree

Swollen Glands
  Yes → Diagnosis = Strep Throat
  No  → Fever
          Yes → Diagnosis = Cold
          No  → Diagnosis = Allergy

IF Swollen Glands = “YES”, THEN Diagnosis = Strep Throat
IF Swollen Glands = “NO” AND Fever = “YES”, THEN Diagnosis = Cold
IF Swollen Glands = “NO” AND Fever = “NO”, THEN Diagnosis = Allergy