Classifier evaluation metrics

(Performance measures for data classification models)
(data)3 base|warehouse|mining
http://www.dataminingtrend.com
http://facebook.com/datacube.th
Eakasit Pacharawongsakda, Ph.D.
Certified RapidMiner Analyst

Performance (classification)
• Performance measures for classification models
  • Confusion Matrix
    • True Positive (TP), True Negative (TN)
    • False Positive (FP), False Negative (FN)
  • Precision and Recall
  • Accuracy
  • F-Measure
  • ROC Graph & Area Under Curve (AUC)

Performance (classification)
• Consider the class "normal"
  • True Positive (TP)
  • True Negative (TN)
  • False Positive (FP)
  • False Negative (FN)

ID  Type    Predicted
1   spam    spam
2   spam    spam
3   normal  normal
4   normal  spam
5   spam    spam
6   spam    spam
7   normal  spam
8   spam    normal
9   normal  normal
10  normal  normal
11  spam    spam
12  spam    spam
13  spam    normal
14  spam    normal
15  normal  normal

Confusion matrix for class normal (rows: predicted class, columns: true class)
pred. \ true  normal  spam
normal        TP      FP
spam          FN      TN

Performance (classification)
• Consider the class "normal"
  • True Positive (TP)
    • The number of records whose prediction matches the actual label, within the class under consideration (IDs 3, 9, 10, 15)
  • True Negative (TN)
  • False Positive (FP)
  • False Negative (FN)

pred. \ true  normal  spam
normal        4       FP
spam          FN      TN

Performance (classification)
• Consider the class "normal"
  • True Positive (TP)
  • True Negative (TN)
    • The number of records whose prediction matches the actual label, within the class not under consideration (IDs 1, 2, 5, 6, 11, 12)
  • False Positive (FP)
  • False Negative (FN)

pred. \ true  normal  spam
normal        4       FP
spam          FN      6

Performance (classification)
• Consider the class "normal"
  • True Positive (TP)
  • True Negative (TN)
  • False Positive (FP)
    • The number of records wrongly predicted as the class under consideration (IDs 8, 13, 14)
  • False Negative (FN)

pred. \ true  normal  spam
normal        4       3
spam          FN      6

Performance (classification)
• Consider the class "normal"
  • True Positive (TP)
  • True Negative (TN)
  • False Positive (FP)
  • False Negative (FN)
    • The number of records wrongly predicted as the class not under consideration (IDs 4, 7)

pred. \ true  normal  spam
normal        4       3
spam          2       6
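
The four counts can be reproduced from the 15-record table with a short Python sketch. The names `records` and `confusion_counts` are illustrative assumptions, not part of the original slides.

```python
# (actual, predicted) pairs for IDs 1..15, copied from the ID/Type/Predicted table.
records = [
    ("spam", "spam"), ("spam", "spam"), ("normal", "normal"),
    ("normal", "spam"), ("spam", "spam"), ("spam", "spam"),
    ("normal", "spam"), ("spam", "normal"), ("normal", "normal"),
    ("normal", "normal"), ("spam", "spam"), ("spam", "spam"),
    ("spam", "normal"), ("spam", "normal"), ("normal", "normal"),
]


def confusion_counts(records, positive):
    """Count TP, TN, FP, FN, treating `positive` as the class under consideration."""
    tp = sum(1 for actual, pred in records if pred == positive and actual == positive)
    tn = sum(1 for actual, pred in records if pred != positive and actual != positive)
    fp = sum(1 for actual, pred in records if pred == positive and actual != positive)
    fn = sum(1 for actual, pred in records if pred != positive and actual == positive)
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


counts = confusion_counts(records, "normal")
print(counts)
```

Calling `confusion_counts(records, "spam")` instead swaps the roles of the two classes, which is how the per-class figures for spam are obtained.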

Performance (classification)
• Precision
  • Of the records predicted as the class under consideration, the fraction that were predicted correctly
• Precision for normal
  • Precision = True Positive / (True Positive + False Positive)
  • 4/7 x 100 = 57.14%
• Precision for spam
  • 6/8 x 100 = 75%

Records predicted as class normal
ID  Type    Predicted
3   normal  normal
8   spam    normal
9   normal  normal
10  normal  normal
13  spam    normal
14  spam    normal
15  normal  normal

Records predicted as class spam
ID  Type    Predicted
1   spam    spam
2   spam    spam
4   normal  spam
5   spam    spam
6   spam    spam
7   normal  spam
11  spam    spam
12  spam    spam

Confusion matrix for class normal (Precision uses the predicted-normal row: TP and FP)
pred. \ true  normal  spam
normal        TP      FP
spam          FN      TN
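
As a quick check of the arithmetic, here is a minimal Python sketch of the precision formula. The function name `precision` is an assumption for illustration. The counts come from the confusion matrix (normal: TP = 4, FP = 3) and from the eight records predicted as spam (6 correct, 2 wrong).

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP), returned as a percentage."""
    return 100.0 * tp / (tp + fp)


precision_normal = precision(tp=4, fp=3)  # 4 of the 7 records predicted as normal
precision_spam = precision(tp=6, fp=2)    # 6 of the 8 records predicted as spam

print(f"normal: {precision_normal:.2f}%")
print(f"spam:   {precision_spam:.2f}%")
```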

Performance (classification)
• Recall
  • Of the records that actually belong to the class under consideration, the fraction that were predicted correctly
• Recall for normal
  • Recall = True Positive / (True Positive + False Negative)
  • 4/6 x 100 = 66.67%
• Recall for spam
  • 6/9 x 100 = 66.67%

Confusion matrix for class normal (Recall uses the true-normal column: TP and FN)
pred. \ true  normal  spam
normal        TP      FP
spam          FN      TN

Records whose actual class is normal
ID  Type    Predicted
3   normal  normal
4   normal  spam
7   normal  spam
9   normal  normal
10  normal  normal
15  normal  normal

Records whose actual class is spam
ID  Type    Predicted
1   spam    spam
2   spam    spam
5   spam    spam
6   spam    spam
8   spam    normal
11  spam    spam
12  spam    spam
13  spam    normal
14  spam    normal
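
Recall can also be computed directly from the label columns. The sketch below assumes the 15-record ID/Type/Predicted table shown earlier in the deck, in which ID 14 is predicted as normal. The names `actual`, `predicted`, and `recall` are illustrative.

```python
# Actual ("Type") and predicted labels for IDs 1..15, from the 15-record table.
actual = ["spam", "spam", "normal", "normal", "spam", "spam", "normal", "spam",
          "normal", "normal", "spam", "spam", "spam", "spam", "normal"]
predicted = ["spam", "spam", "normal", "spam", "spam", "spam", "spam", "normal",
             "normal", "normal", "spam", "spam", "normal", "normal", "normal"]


def recall(actual, predicted, cls):
    """Recall = TP / (TP + FN): the share of true `cls` records predicted as `cls`."""
    # Keep only the predictions for records whose actual class is `cls`.
    relevant = [p for a, p in zip(actual, predicted) if a == cls]
    return 100.0 * sum(1 for p in relevant if p == cls) / len(relevant)


recall_normal = recall(actual, predicted, "normal")  # 4 of 6 normal records
recall_spam = recall(actual, predicted, "spam")      # 6 of 9 spam records

print(f"normal: {recall_normal:.2f}%")
print(f"spam:   {recall_spam:.2f}%")
```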

Performance (classification)
• F-Measure
  • The harmonic mean of Precision and Recall
  • F-Measure = (2 x Precision x Recall) / (Precision + Recall)
• F-Measure for normal
  • Precision = 4/7 x 100 = 57.14%, Recall = 4/6 x 100 = 66.67%
  • (2 x 57.14 x 66.67) / (57.14 + 66.67) = 61.54%
• F-Measure for spam
  • Precision = 6/8 x 100 = 75%, Recall = 6/9 x 100 = 66.67%
  • (2 x 75 x 66.67) / (75 + 66.67) = 70.59%
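
The F-Measure formula can be checked with a small Python sketch. The function name `f_measure` is an assumption for illustration. Precision and recall are passed as fractions derived from the 15-record table (normal: 4/7 and 4/6; spam: 6/8 and 6/9).

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall; inputs and output are fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are 0
    return 2 * precision * recall / (precision + recall)


f_normal = f_measure(4 / 7, 4 / 6)
f_spam = f_measure(6 / 8, 6 / 9)

print(f"normal: {100 * f_normal:.2f}%")
print(f"spam:   {100 * f_spam:.2f}%")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model cannot earn a high F-Measure by excelling at only one of precision or recall.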

Performance (classification)
• Accuracy
  • Of all records, the fraction predicted correctly across every class
  • Accuracy = (True Positive + True Negative) / (True Positive + True Negative + False Positive + False Negative)
  • (4 + 6)/15 x 100 = 10/15 x 100 = 66.67%

Confusion matrix for class normal (Accuracy uses the diagonal: TP and TN)
pred. \ true  normal  spam
normal        TP      FP
spam          FN      TN
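
A minimal Python sketch of the accuracy formula, using the four counts from the confusion matrix for class normal (TP = 4, TN = 6, FP = 3, FN = 2). The function name `accuracy` is an illustrative assumption.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN), returned as a percentage."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


acc = accuracy(tp=4, tn=6, fp=3, fn=2)  # 10 correct predictions out of 15
print(f"accuracy: {acc:.2f}%")
```

Accuracy is the same whichever class is treated as the one under consideration, because swapping the classes only exchanges TP with TN and FP with FN.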

ROC Graph & Area Under Curve (AUC)
• Receiver Operating Characteristics (ROC): a graph of the relationship between correct predictions (Y axis, true positive rate) and incorrect predictions (X axis, false positive rate)

ID  Type    Predicted  Score  TP rate  FP rate
1   normal  spam       0.80   1.00     1.00
2   spam    spam       0.85   1.00     0.66
4   normal  spam       0.87   0.80     0.66
5   spam    spam       0.90   0.80     0.33
6   spam    spam       0.92   0.60     0.33
7   normal  spam       0.95   0.40     0.33
11  spam    spam       0.98   0.40     0.00
12  spam    spam       0.99   0.20     0.00

[Figure: ROC curve drawn point by point from the table. X axis: False Positive Rate (FP rate); Y axis: True Positive rate (TP rate); both run from 0 to 1.0]
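
The TP rate and FP rate columns can be reproduced with the Python sketch below. It assumes that spam is the positive class and that each row's score acts as a threshold: every record scoring at or above it is predicted as spam. The names `scored` and `roc_points` are illustrative.

```python
# (score, actual class) for the eight records in the ROC table.
scored = [
    (0.80, "normal"), (0.85, "spam"), (0.87, "normal"), (0.90, "spam"),
    (0.92, "spam"), (0.95, "normal"), (0.98, "spam"), (0.99, "spam"),
]


def roc_points(scored, positive="spam"):
    """Return (FP rate, TP rate) pairs, lowering the threshold one record at a time."""
    n_pos = sum(1 for _, label in scored if label == positive)
    n_neg = len(scored) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing is predicted positive
    tp = fp = 0
    for _, label in sorted(scored, reverse=True):  # highest score first
        if label == positive:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


points = roc_points(scored)
for fp_rate, tp_rate in points:
    print(f"FP rate = {fp_rate:.2f}, TP rate = {tp_rate:.2f}")
```

Read from the bottom row of the table upward, the printed pairs follow the same sequence as the TP rate and FP rate columns. The table shows 0.33 and 0.66 where the exact values are 1/3 and 2/3.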

ROC Graph & Area Under Curve (AUC)
• The closer the ROC curve gets to 1 on the True Positive axis, the better the model performs
  • because the true positive rate is high

[Figure: three ROC curves labeled "The best", "Good", and "Bad". X axis: False Positive; Y axis: True Positive]

ROC Graph & Area Under Curve (AUC)
• Area Under Curve (AUC) is the area under the ROC graph
• The larger the value (the closer to 1), the better

[Figure: two ROC graphs with the area under each curve marked "AUC". X axis: False Positive; Y axis: True Positive]
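
One common way to obtain the AUC is the trapezoidal rule over the ROC points. The slides do not say which method they use, so this Python sketch is an assumption. The points are taken from the eight-record ROC table, with 1/3 and 2/3 standing in for the rounded 0.33 and 0.66. The names `roc` and `auc_trapezoid` are illustrative.

```python
# (FP rate, TP rate) points from the eight-record ROC table, starting at (0, 0).
roc = [
    (0.0, 0.0), (0.0, 0.2), (0.0, 0.4), (1 / 3, 0.4), (1 / 3, 0.6),
    (1 / 3, 0.8), (2 / 3, 0.8), (2 / 3, 1.0), (1.0, 1.0),
]


def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid between neighbouring points
    return area


auc = auc_trapezoid(roc)
print(f"AUC = {auc:.4f}")
```

For this table the area is 11/15, about 0.73. A curve that rises straight to a TP rate of 1.0 at an FP rate of 0 would give 1.0, and the diagonal from (0, 0) to (1, 1) would give 0.5.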

Evaluation metrics: Precision, Recall, F-Measure, ROC

  • 2. (data)3
 base|warehouse|mining http://dataminingtrend.com http://facebook.com/datacube.th Performance (classification) • ตัววัดประสิทธิิภาพของโมเดล classification • Confusion Matrix • True Positive (TP), True Negative (TN) • False Positive (FP), False Negative (FN) • Precision and Recall • Accuracy • F-Measure • ROC Graph & Area Under Curve (AUC) 2
  • 3. (data)3
 base|warehouse|mining http://dataminingtrend.com http://facebook.com/datacube.th Performance (classification) • พิจารณาคลาส normal • True Positive (TP) • True Negative (TN) • False Positive (FP) • False Negative (FN) 3 ID Type Predicted 1 spam spam 2 spam spam 3 normal normal 4 normal spam 5 spam spam 6 spam spam 7 normal spam 8 spam normal 9 normal normal 10 normal normal 11 spam spam 12 spam spam 13 spam normal 14 spam normal 15 normal normal ID Type Predicted 1 spam spam 2 spam spam 3 normal normal 4 normal spam 5 spam spam 6 spam spam 7 normal spam 8 spam normal 9 normal normal 10 normal normal 11 spam spam 12 spam spam 13 spam normal 14 spam normal 15 normal normal pred.true. normal spam normal TP FP spam FN TN dataminingtrend.com dataminingtrend.com
  • 4. (data)3
 base|warehouse|mining http://dataminingtrend.com http://facebook.com/datacube.th Performance (classification) • พิจารณาคลาส normal • True Positive (TP) • จำนวนที่ทำนายตรงกับข้อมูลจริงใน คลาสที่กำลังพิจารณา • True Negative (TN) • False Positive (FP) • False Negative (FN) 4 ID Type Predicted 1 spam spam 2 spam spam 3 normal normal 4 normal spam 5 spam spam 6 spam spam 7 normal spam 8 spam normal 9 normal normal 10 normal normal 11 spam spam 12 spam spam 13 spam normal 14 spam normal 15 normal normal ID Type Predicted 1 spam spam 2 spam spam 3 normal normal 4 normal spam 5 spam spam 6 spam spam 7 normal spam 8 spam normal 9 normal normal 10 normal normal 11 spam spam 12 spam spam 13 spam normal 14 spam normal 15 normal normal pred.true. normal spam normal 4 FP spam FN TN dataminingtrend.com dataminingtrend.com
  • 5. (data)3
 base|warehouse|mining http://dataminingtrend.com http://facebook.com/datacube.th Performance (classification) • พิจารณาคลาส normal • True Positive (TP) • True Negative (TN) • จำนวนที่ทำนายตรงกับข้อมูลจริงใน คลาสที่ไม่ได้กำลังพิจารณา • False Positive (FP) • False Negative (FN) 5 ID Type Predicted 1 spam spam 2 spam spam 3 normal normal 4 normal spam 5 spam spam 6 spam spam 7 normal spam 8 spam normal 9 normal normal 10 normal normal 11 spam spam 12 spam spam 13 spam normal 14 spam normal 15 normal normal ID Type Predicted 1 spam spam 2 spam spam 3 normal normal 4 normal spam 5 spam spam 6 spam spam 7 normal spam 8 spam normal 9 normal normal 10 normal normal 11 spam spam 12 spam spam 13 spam normal 14 spam normal 15 normal normal pred.true. normal spam normal 4 FP spam FN 6 dataminingtrend.com dataminingtrend.com
  • 6. (data)3
 base|warehouse|mining http://dataminingtrend.com http://facebook.com/datacube.th Performance (classification) • พิจารณาคลาส normal • True Positive (TP) • True Negative (TN) • False Positive (FP) • จำนวนที่ทำนายผิดเป็นคลาสที่กำลัง พิจารณา • False Negative (FN) 6 ID Type Predicted 1 spam spam 2 spam spam 3 normal normal 4 normal spam 5 spam spam 6 spam spam 7 normal spam 8 spam normal 9 normal normal 10 normal normal 11 spam spam 12 spam spam 13 spam normal 14 spam normal 15 normal normal ID Type Predicted 1 spam spam 2 spam spam 3 normal normal 4 normal spam 5 spam spam 6 spam spam 7 normal spam 8 spam normal 9 normal normal 10 normal normal 11 spam spam 12 spam spam 13 spam normal 14 spam normal 15 normal normal pred.true. normal spam normal 4 3 spam FN 6 dataminingtrend.com dataminingtrend.com
  • 7. (data)3
 base|warehouse|mining http://dataminingtrend.com http://facebook.com/datacube.th Performance (classification) • พิจารณาคลาส normal • True Positive (TP) • True Negative (TN) • False Positive (FP) • False Negative (FN) • จำนวนที่ทำนายผิดเป็นคลาสที่ไม่ได้ กำลังพิจารณา 7 ID Type Predicted 1 spam spam 2 spam spam 3 normal normal 4 normal spam 5 spam spam 6 spam spam 7 normal spam 8 spam normal 9 normal normal 10 normal normal 11 spam spam 12 spam spam 13 spam normal 14 spam normal 15 normal normal ID Type Predicted 1 spam spam 2 spam spam 3 normal normal 4 normal spam 5 spam spam 6 spam spam 7 normal spam 8 spam normal 9 normal normal 10 normal normal 11 spam spam 12 spam spam 13 spam normal 14 spam normal 15 normal normal pred.true. normal spam normal 4 3 spam 2 6 dataminingtrend.com dataminingtrend.com
  • 8. (data)3
 base|warehouse|mining http://dataminingtrend.com http://facebook.com/datacube.th Performance (classification) • ตัววัดประสิทธิิภาพของโมเดล classification • Confusion Matrix • True Positive (TP), True Negative (TN) • False Positive (FP), False Negative (FN) • Precision and Recall • F-Measure • Accuracy • ROC Graph & Area Under Curve (AUC) 8
  • 9. (data)3
 base|warehouse|mining http://dataminingtrend.com http://facebook.com/datacube.th Performance (classification) • Precision • จำนวนที่ทำนายถูกจากข้อมูลที่ ทำนายว่าเป็นคลาสที่พิจารณาอยู่ • Precision สำหรับ normal • True Positive
 True Positive + False Positive • 4/7 x 100 = 57.12% • Precision สำหรับ spam • 6/8 x 100 = 75% 9 ID Type Predicted 3 normal normal 8 spam normal 9 normal normal 10 normal normal 13 spam normal 14 spam normal 15 normal normal pred.true. normal spam normal TP FP spam FN TN Precision ID Type Predicted 1 spam spam 2 spam spam 4 normal spam 5 spam spam 6 spam spam 7 normal spam 11 spam spam 12 spam spam predict เป็นคลาส spam predict เป็นคลาส normal confusion matrix ของคลาส normal dataminingtrend.com dataminingtrend.com
  • 10. (data)3
 base|warehouse|mining http://dataminingtrend.com http://facebook.com/datacube.th Performance (classification) • Recall • จำนวนข้อมูลที่ทำนายถูก • Recall สำหรับ normal • True Positive
 True Positive + False Negative • 4/6 x 100 = 66.67% • Recall สำหรับ spam • 7/9 x 100 = 77.78% 10 pred.true. normal spam normal TP FP spam FN TN คลาส spam คลาส normal confusion matrix ของคลาส normal Recall ID Type Predicted 3 normal normal 4 normal spam 7 normal spam 9 normal normal 10 normal normal 15 normal normal ID Type Predicted 1 spam spam 2 spam spam 5 spam spam 6 spam spam 8 spam normal 11 spam spam 12 spam spam 13 spam normal 14 spam spam dataminingtrend.com dataminingtrend.com
  • 12. Performance (classification)
   • F-Measure
   • The harmonic mean of Precision and Recall
   • F-Measure = (2 x Precision x Recall) / (Precision + Recall)
   • F-Measure for normal: (2 x 57.14 x 66.67) / (57.14 + 66.67) = 61.54%
   • F-Measure for spam: (2 x 75 x 66.67) / (75 + 66.67) = 70.59%

   For class normal:
   Precision = 4/7 x 100 = 57.14% (records predicted as normal: IDs 3, 8, 9, 10, 13, 14, 15)
   Recall = 4/6 x 100 = 66.67% (records of class normal: IDs 3, 4, 7, 9, 10, 15)
   For class spam, per the confusion matrix:
   Precision = 6/8 x 100 = 75%
   Recall = 6/9 x 100 = 66.67%
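The F-Measure formula can be checked with a few lines of code. This is a minimal sketch under the assumption that precision and recall are taken from the confusion-matrix counts (normal: 4/7 and 4/6; spam: 6/8 and 6/9); `f_measure` is a hypothetical helper name.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (inputs and result are fractions in [0, 1])."""
    return 2 * precision * recall / (precision + recall)


# Fractions taken from the confusion-matrix counts
f_normal = f_measure(precision=4 / 7, recall=4 / 6)
f_spam = f_measure(precision=6 / 8, recall=6 / 9)

print(f"{100 * f_normal:.2f}% {100 * f_spam:.2f}%")  # 61.54% 70.59%
```

Working with unrounded fractions avoids the small drift that appears when the rounded percentages are plugged into the formula.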
  • 14. Performance (classification)
   • Accuracy
   • The proportion of records predicted correctly across all classes
   • Accuracy = (True Positive + True Negative) / (True Positive + True Negative + False Positive + False Negative)
   • (4 + 6) / 15 = 10/15 x 100 = 66.67%

   Confusion matrix for class normal (rows: predicted class, columns: true class)
   pred. \ true   normal   spam
   normal         TP       FP
   spam           FN       TN

   Accuracy is computed over all 15 records (IDs 1 to 15) of the table on slide 7.
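A minimal sketch of the accuracy calculation, using the counts TP = 4, TN = 6, FP = 3, FN = 2 from the confusion matrix for class normal; the function name `accuracy` is hypothetical.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy as a percentage: correct predictions of every class / all records."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


# Counts from the confusion matrix for class normal
acc = accuracy(tp=4, tn=6, fp=3, fn=2)
print(f"{acc:.2f}%")  # 66.67%
```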
  • 16. ROC Graph & Area
   • Receiver Operating Characteristics (ROC): a graph of the relationship between the rate of correct predictions (TP rate, Y axis) and the rate of incorrect predictions (FP rate, X axis)

   ID  Type    Predicted  Score  TP rate  FP rate
   1   normal  spam       0.80   1.00     1.00
   2   spam    spam       0.85   1.00     0.67
   4   normal  spam       0.87   0.80     0.67
   5   spam    spam       0.90   0.80     0.33
   6   spam    spam       0.92   0.60     0.33
   7   normal  spam       0.95   0.40     0.33
   11  spam    spam       0.98   0.40     0.00
   12  spam    spam       0.99   0.20     0.00

   [Figure: ROC graph. X axis: False Positive Rate (FP rate), 0 to 1.0. Y axis: True Positive rate (TP rate), 0 to 1.0.]
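The TP rate and FP rate columns are consistent with the following reading, which is an assumption rather than something the slide states: spam is the positive class, each row's Score is used in turn as a threshold, and every record with a score at or above the threshold counts as predicted spam. A minimal sketch under that assumption (`SCORED` and `roc_points` are hypothetical names):

```python
# (id, actual type, score) for the 8 records in the ROC table
SCORED = [
    (1, "normal", 0.80), (2, "spam", 0.85), (4, "normal", 0.87), (5, "spam", 0.90),
    (6, "spam", 0.92), (7, "normal", 0.95), (11, "spam", 0.98), (12, "spam", 0.99),
]


def roc_points(scored, positive="spam"):
    """Return (threshold, TP rate, FP rate) for each score used as a threshold."""
    n_pos = sum(1 for _, t, _ in scored if t == positive)
    n_neg = len(scored) - n_pos
    points = []
    for _, _, threshold in sorted(scored, key=lambda r: r[2]):
        # Records scoring at or above the threshold are treated as predicted positive
        tp = sum(1 for _, t, s in scored if s >= threshold and t == positive)
        fp = sum(1 for _, t, s in scored if s >= threshold and t != positive)
        points.append((threshold, tp / n_pos, fp / n_neg))
    return points


for threshold, tpr, fpr in roc_points(SCORED):
    print(f"{threshold:.2f}  TP rate={tpr:.2f}  FP rate={fpr:.2f}")
```

Under this reading the row for ID 7 (score 0.95) has a TP rate of 2/5 = 0.40, and the rows with an FP rate of 2/3 round to 0.67.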
  • 24. ROC Graph & Area
   • Same ROC definition, table and graph as slide 16, with the plotted line labeled "ROC Curve"
  • 25. ROC Graph & Area
   • A ROC curve that approaches 1 indicates good performance
   • This is because the model produces many True Positives

   [Figure: ROC graph with three curves labeled "The best", "Good" and "Bad". X axis: False Positive, 0 to 1.0. Y axis: True Positive, 0 to 1.0.]
  • 26. ROC Graph & Area
   • Area Under Curve (AUC) is the area under the ROC graph
   • The larger the value (the closer to 1), the better

   [Figure: two ROC graphs, each with the area under the curve labeled "AUC". X axis: False Positive. Y axis: True Positive.]
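As an illustration of how an AUC value can be obtained, the sketch below applies the trapezoid rule to the (FP rate, TP rate) points of the ROC table on slide 16. This is one common way to compute the area, not necessarily the method used by the author's tools. Adding the origin (0, 0) as a starting point is an assumption, and `ROC_POINTS` and `auc_trapezoid` are hypothetical names. The resulting value is computed here and does not appear in the slides.

```python
# (FP rate, TP rate) pairs from the ROC table, written as exact fractions
ROC_POINTS = [
    (1.0, 1.0), (2 / 3, 1.0), (2 / 3, 0.8), (1 / 3, 0.8),
    (1 / 3, 0.6), (1 / 3, 0.4), (0.0, 0.4), (0.0, 0.2),
]


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoid rule, starting from the origin."""
    # Assumption: the curve starts at (0, 0); sort by FP rate, then TP rate
    pts = sorted([(0.0, 0.0)] + list(points))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # vertical steps contribute zero width
    return area


auc = auc_trapezoid(ROC_POINTS)
print(f"AUC = {auc:.3f}")  # AUC = 0.733
```

A value of 0.5 would correspond to the diagonal of the graph, and 1.0 to a curve that passes through the top-left corner.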