Chapter 5. Data Mining: A Closer Look
Data Warehouse and Data Mining, Chapter 5

Chapter Objectives Determine an appropriate data mining strategy for a specific problem. Know about several data mining techniques and how each technique builds a generalized model to represent data. Understand how a confusion matrix is used to help evaluate supervised learner models.
Data Warehouse and Data Mining Chapter 5 3 
Understand basic techniques for evaluating supervised learner models with numeric output. Know how measuring lift can be used to compare the performance of several competing supervised learner models. Understand basic techniques for evaluating unsupervised learner models. Chapter Objectives
Data Warehouse and Data Mining Chapter 5 4 
Data Mining Strategies
Classification is probably the best understood of all data mining strategies. Classification tasks have three common characteristics:
• Learning is supervised.
• The dependent variable is categorical.
• The emphasis is on building models able to assign new instances to one of a set of well-defined classes.

Some example classification tasks include the following:
• Determine those characteristics that differentiate individuals who have suffered a heart attack from those who have not.
• Develop a profile of a “successful” person.
• Determine if a credit card purchase is fraudulent.
• Classify a car loan applicant as a good or a poor credit risk.
• Develop a profile to differentiate female and male stroke victims.

[Data Mining Strategies, slides 6-10: slide content not preserved. One surviving annotation reads: "34% are healthy within this max heart rate range."]

Supervised Data Mining Techniques
[Slides 11-15: slide content not preserved.]

Association Rules
[Slide 16: slide content not preserved.]

Clustering Techniques
[Slides 17-18: slide content not preserved.]

Evaluating Performance
[Slides 19-23: slide content not preserved.]

Chapter Summary
Data mining strategies include classification, estimation, prediction, unsupervised clustering, and market basket analysis. Classification and estimation strategies are similar in that each strategy is employed to build models able to generalize current outcome. However, the output of a classification strategy is categorical, whereas the output of an estimation strategy is numeric.

A predictive strategy differs from a classification or estimation strategy in that it is used to design models for predicting future outcome rather than current behavior.

Unsupervised clustering strategies are employed to discover hidden concept structures in data as well as to locate atypical data instances.

The purpose of market basket analysis is to find interesting relationships among retail products. Discovered relationships can be used to design promotions, arrange shelf or catalog items, or develop cross-marketing strategies.

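As a rough illustration of the market basket idea, the sketch below counts how often pairs of items appear together in a handful of baskets. Support and confidence are the usual association rule measures, but the slides' text does not define them, so treat them as an assumption here. The basket data and the function names `pair_support` and `confidence` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each basket is the set of items in one purchase.
BASKETS = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"milk", "bread", "butter"},
]


def pair_support(baskets):
    """Return, for each item pair, the fraction of baskets containing both items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Estimate P(consequent in basket | antecedent in basket)."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


if __name__ == "__main__":
    print(pair_support(BASKETS)[("bread", "milk")])  # 3 of 5 baskets
    print(confidence(BASKETS, "milk", "bread"))      # 3 of the 4 milk baskets
```

A pair with high support and high confidence in both directions would be a candidate for the shelf arrangement or cross-marketing uses mentioned above.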
A data mining technique applies a data mining strategy to a set of data. Data mining techniques are defined by an algorithm and a knowledge structure. Common features that distinguish the various techniques are whether learning is supervised or unsupervised and whether their output is categorical or numeric.

Familiar supervised data mining techniques include decision tree methods, production rule generators, neural networks, and statistical methods. Association rules are a favorite technique for marketing applications. Clustering techniques employ some measure of similarity to group instances into disjoint partitions. Clustering methods are frequently used to help determine a best set of input attributes for building supervised learner models.

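To make "some measure of similarity" and "disjoint partitions" concrete, here is a minimal sketch of k-means with Euclidean distance. The surviving slide text does not name a specific clustering algorithm, so k-means is swapped in purely for illustration. The function names and the caller-supplied starting centers are assumptions.

```python
import math


def assign(points, centers):
    """Label each point with the index of its nearest center (Euclidean distance)."""
    return [
        min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
        for p in points
    ]


def update(points, labels, centers):
    """Move each center to the mean of its members; an empty cluster keeps its center."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if members:
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centers.append(old)
    return new_centers


def kmeans(points, centers, max_iter=100):
    """Alternate assignment and update steps until the centers stop moving."""
    labels = assign(points, centers)
    for _ in range(max_iter):
        new_centers = update(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
        labels = assign(points, centers)
    return labels, centers


if __name__ == "__main__":
    # Hypothetical two-attribute instances forming two obvious groups.
    data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
    labels, centers = kmeans(data, [(0.0, 0.0), (10.0, 10.0)])
    print(labels)
    print(centers)
```

Each instance receives exactly one label, so the resulting clusters are disjoint partitions in the sense used above.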
Performance evaluation is probably the most critical of all the steps in the data mining process. Supervised model evaluation is often performed using a training/test set scenario. Supervised models with numeric output can be evaluated by computing the average absolute or average squared difference between computed and desired outcomes.

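A training/test set scenario can be sketched as follows: hold out part of the data, build the model on the rest, and measure performance only on the held-out instances. The 30% test fraction, the majority-class baseline used as a stand-in "model", and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def train_test_split(instances, test_fraction=0.3, seed=0):
    """Shuffle a copy of the instances and split it into (training, test) lists."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def majority_class(training_set):
    """Baseline 'model': the most common class label in the training set."""
    return Counter(label for _, label in training_set).most_common(1)[0][0]


def holdout_accuracy(test_set, predicted_class):
    """Fraction of test instances whose label equals the predicted class."""
    if not test_set:
        return 0.0
    correct = sum(1 for _, label in test_set if label == predicted_class)
    return correct / len(test_set)


if __name__ == "__main__":
    # Hypothetical (attribute, class) instances.
    data = [(i, "good" if i % 3 else "poor") for i in range(12)]
    train, test = train_test_split(data)
    model = majority_class(train)
    print(model, holdout_accuracy(test, model))
```

Any real learner can replace the baseline; the point is that the test instances play no part in building the model.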
Marketing applications that focus on mass mailings are interested in developing models for increasing response rates to promotions. A marketing application measures the goodness of a model by its ability to lift response rate thresholds to levels well above those achieved by naïve (mass) mailing strategies. Unsupervised models support some measure of cluster quality that can be used for evaluative purposes. Supervised learning can also be employed to evaluate the quality of the clusters formed by an unsupervised model.

Key Terms
Classification. A supervised learning strategy where the output attribute is categorical. The emphasis is on building models able to assign new instances to one of a set of well-defined classes.
Association rule. A production rule whose consequent may contain multiple conditions and attribute relationships. An output attribute in one association rule can be an input attribute in another rule.
Confusion matrix. A matrix used to summarize the results of a supervised classification. Entries along the main diagonal represent the total number of correct classifications. Entries other than those on the main diagonal represent classification errors.

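The confusion matrix definition can be illustrated with a small sketch. It assumes the common convention that rows are actual classes and columns are predicted classes, which the slide text does not state. The class labels, data, and function names are hypothetical.

```python
def confusion_matrix(actual, predicted, classes):
    """Build a matrix whose rows are actual classes and columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def matrix_accuracy(matrix):
    """Correct classifications (main diagonal) divided by all classifications."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical credit-risk labels for six test instances.
    actual = ["good", "good", "poor", "poor", "good", "poor"]
    predicted = ["good", "poor", "poor", "poor", "good", "good"]
    m = confusion_matrix(actual, predicted, ["good", "poor"])
    print(m)                   # [[2, 1], [1, 2]]
    print(matrix_accuracy(m))  # 4 correct out of 6
```

The off-diagonal entries show which kind of error the model makes, which a single accuracy figure hides.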
Data mining strategy. An outline of an approach for problem solution.
Data mining technique. One or more algorithms together with an associated knowledge structure.
Dependent variable. A variable whose value is determined by a combination of one or more independent variables.
Estimation. A supervised learning strategy where the output attribute is numeric. Emphasis is on determining current rather than future outcome.

Independent variable. An input attribute used for building supervised or unsupervised learner models.
Lift. The probability of class C_i given a sample taken from population P, divided by the probability of C_i given the entire population P.
Lift chart. A graph that displays the performance of a data mining model as a function of sample size.
Linear regression. A supervised learning technique that generalizes numeric data as a linear equation. The equation defines the value of an output attribute as a linear sum of weighted input attribute values.

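The lift definition reduces to a ratio of two response rates. The sketch below computes it for a mass-mailing example. The counts are invented for illustration and the function name `lift` is an assumption.

```python
def lift(sample_hits, sample_size, population_hits, population_size):
    """Lift = P(C_i | sample) / P(C_i | population)."""
    if sample_size <= 0 or population_size <= 0 or population_hits <= 0:
        raise ValueError("sizes and population_hits must be positive")
    p_sample = sample_hits / sample_size
    p_population = population_hits / population_size
    return p_sample / p_population


if __name__ == "__main__":
    # Hypothetical campaign: 1,000 of 100,000 customers respond overall (1%),
    # while 600 of the 20,000 customers selected by a model respond (3%).
    print(lift(600, 20_000, 1_000, 100_000))  # roughly 3
```

A lift near 1 means the model's selection does no better than mailing everyone. Plotting lift for increasing sample sizes gives the lift chart described above.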
Market basket analysis. A data mining strategy that attempts to find interesting relationships among retail products.
Mean absolute error. For a set of training or test set instances, the mean absolute error is the average absolute difference between predicted output and actual output.
Mean squared error. For a set of training or test set instances, the mean squared error is the average of the squared differences between predicted output and actual output.
Neural network. A set of interconnected nodes designed to imitate the functioning of the human brain.

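The error measures for numeric output can be written out directly from their definitions. The sketch below also includes the root mean squared error, which the key terms define as the square root of the mean squared error. The sample values and function names are hypothetical.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all instances."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of (actual - predicted)^2 over all instances."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error, in the units of the output attribute."""
    return math.sqrt(mean_squared_error(actual, predicted))


if __name__ == "__main__":
    # Hypothetical test set outputs.
    actual = [3.0, 5.0, 2.0, 7.0]
    predicted = [2.0, 5.0, 4.0, 8.0]
    print(mean_absolute_error(actual, predicted))      # errors 1, 0, 2, 1 -> 1.0
    print(mean_squared_error(actual, predicted))       # squares 1, 0, 4, 1 -> 1.5
    print(root_mean_squared_error(actual, predicted))  # about 1.22
```

Squaring penalizes the single error of 2 more heavily than the absolute measure does, which is why the two measures can rank competing models differently.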
Outliers. Atypical data instances.
Prediction. A supervised learning strategy designed to determine future outcome.
Root mean squared error. The square root of the mean squared error.
Rule Maker. A supervised learner model for generating production rules from data.
Statistical regression. A supervised learning technique that generalizes numerical data as a mathematical equation. The equation defines the value of an output attribute as a sum of weighted input attribute values.
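For a single input attribute, "a sum of weighted input attribute values" is a straight line, y = slope * x + intercept. The sketch below fits that line by ordinary least squares. Least squares is the usual fitting method but is not named in the slide text, so it is an assumption here, as are the function names and the data.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept for one input attribute."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted linear equation for a new input value."""
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical instances that lie exactly on y = 2x + 1.
    slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(slope, intercept)              # 2.0 1.0
    print(predict(5, slope, intercept))  # 11.0
```

With several input attributes the same idea applies, with one weight per attribute plus a constant term.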

Data Mining Techniques Explained

  • 1. Chapter 5Data mining : A Closer Look
  • 2. Data Warehouse and Data Mining Chapter 5 2 Chapter Objectives Determine an appropriate data mining strategy for a specific problem. Know about several data mining techniques and how each technique builds a generalized model to represent data. Understand how a confusion matrix is used to help evaluate supervised learner models.
  • 3. Data Warehouse and Data Mining Chapter 5 3 Understand basic techniques for evaluating supervised learner models with numeric output. Know how measuring lift can be used to compare the performance of several competing supervised learner models. Understand basic techniques for evaluating unsupervised learner models. Chapter Objectives
  • 4. Data Warehouse and Data Mining Chapter 5 4 Data Mining StrategiesClassificationis probably the best understood of all data mining strategies. Classification tasks have three common characteristics. •Learning is supervised. •The dependent variable is categorical. •The emphasis is on building modelsable to assign new instances to one of a set of well- defined classes.
  • 5. Data Warehouse and Data Mining Chapter 5 5 Data Mining Strategies•Some example classification tasks include the following: •Determine those characteristics that differentiate individuals who have suffered a heart attack from those who have not. •Develop a profile of a “successful” person. •Determine if a credit card purchase is fraudulent. •Classify a car loan applicant as a good or a poor credit risk. •Develop a profile to differentiate female and male stroke victims.
  • 6. Data Warehouse and Data Mining Chapter 5 6 Data Mining Strategies
  • 7. Data Warehouse and Data Mining Chapter 5 7 Data Mining Strategies
  • 8. Data Warehouse and Data Mining Chapter 5 8 Data Mining Strategies
  • 9. Data Warehouse and Data Mining Chapter 5 9 Data Mining Strategies
  • 10. Data Warehouse and Data Mining Chapter 5 10 Data Mining Strategies 34% are healthy within these max heart rate range
  • 11. Data Warehouse and Data Mining Chapter 5 11 Supervised Data Mining Techniques
  • 12. Data Warehouse and Data Mining Chapter 5 12 Supervised Data Mining Techniques
  • 13. Data Warehouse and Data Mining Chapter 5 13 Supervised Data Mining Techniques
  • 14. Data Warehouse and Data Mining Chapter 5 14 Supervised Data Mining Techniques
  • 15. Data Warehouse and Data Mining Chapter 5 15 Supervised Data Mining Techniques
  • 16. Data Warehouse and Data Mining Chapter 5 16 Association Rules
  • 17. Data Warehouse and Data Mining Chapter 5 17 Clustering Techniques
  • 18. Data Warehouse and Data Mining Chapter 5 18 Clustering Techniques
  • 19. Data Warehouse and Data Mining Chapter 5 19 Evaluating Performance
  • 20. Data Warehouse and Data Mining Chapter 5 20 Evaluating Performance
  • 21. Data Warehouse and Data Mining Chapter 5 21 Evaluating Performance
  • 22. Data Warehouse and Data Mining Chapter 5 22 Evaluating Performance
  • 23. Data Warehouse and Data Mining Chapter 5 23 Evaluating Performance
  • 24. Data Warehouse and Data Mining Chapter 5 24 Chapter SummaryData mining strategies include classification, estimation, prediction, unsupervised clustering, and market basket analysis. Classification and estimation strategies are similar in that each strategy is employed to build models able to generalize current outcome. However, the output of a classification strategy is categorical, whereas the output of an estimation strategy is numeric.
  • 25. Chapter Summary: A predictive strategy differs from a classification or estimation strategy in that it is used to design models for predicting future outcome rather than current behavior. Unsupervised clustering strategies are employed to discover hidden concept structures in data as well as to locate atypical data instances. The purpose of market basket analysis is to find interesting relationships among retail products. Discovered relationships can be used to design promotions, arrange shelf or catalog items, or develop cross-marketing strategies.
  • 26. Chapter Summary: A data mining technique applies a data mining strategy to a set of data. Data mining techniques are defined by an algorithm and a knowledge structure. Common features that distinguish the various techniques are whether learning is supervised or unsupervised and whether their output is categorical or numeric.
  • 27. Chapter Summary: Familiar supervised data mining techniques include decision tree methods, production rule generators, neural networks, and statistical methods. Association rules are a favorite technique for marketing applications. Clustering techniques employ some measure of similarity to group instances into disjoint partitions. Clustering methods are frequently used to help determine a best set of input attributes for building supervised learner models.
  • 28. Chapter Summary: Performance evaluation is probably the most critical of all the steps in the data mining process. Supervised model evaluation is often performed using a training/test set scenario. Supervised models with numeric output can be evaluated by computing the average absolute or average squared difference between computed and desired outcomes.
  • 29. Chapter Summary: Marketing applications that focus on mass mailings are interested in developing models for increasing response rates to promotions. A marketing application measures the goodness of a model by its ability to lift response rate thresholds to levels well above those achieved by naïve (mass) mailing strategies. Unsupervised models support some measure of cluster quality that can be used for evaluative purposes. Supervised learning can also be employed to evaluate the quality of the clusters formed by an unsupervised model.
  • 30. Key Terms: Classification. A supervised learning strategy where the output attribute is categorical. The emphasis is on building models able to assign new instances to one of a set of well-defined classes. Association rule. A production rule whose consequent may contain multiple conditions and attribute relationships. An output attribute in one association rule can be an input attribute in another rule. Confusion matrix. A matrix used to summarize the results of a supervised classification. Entries along the main diagonal represent the total number of correct classifications. Entries other than those on the main diagonal represent classification errors.
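The confusion matrix definition above can be illustrated with a minimal Python sketch. The function names (`confusion_matrix`, `accuracy`) and the "good"/"poor" credit-risk labels are hypothetical examples, not taken from the slides; rows are assumed to hold actual classes and columns predicted classes.

```python
from collections import Counter

def confusion_matrix(actual, predicted, classes):
    """Build a matrix whose rows are actual classes and columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]

def accuracy(matrix):
    """Fraction of instances on the main diagonal (correct classifications)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0

# Hypothetical test-set labels for a credit-risk classifier (illustrative only).
classes = ["good", "poor"]
actual = ["good", "good", "poor", "poor", "good", "poor"]
predicted = ["good", "poor", "poor", "poor", "good", "good"]

matrix = confusion_matrix(actual, predicted, classes)
for label, row in zip(classes, matrix):
    print(label, row)
print("accuracy:", accuracy(matrix))
```

Off-diagonal entries show which classes are confused with which: here one "good" applicant is predicted "poor" and one "poor" applicant is predicted "good".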
  • 31. Key Terms: Data mining strategy. An outline of an approach for problem solution. Data mining technique. One or more algorithms together with an associated knowledge structure. Dependent variable. A variable whose value is determined by a combination of one or more independent variables. Estimation. A supervised learning strategy where the output attribute is numeric. Emphasis is on determining current rather than future outcome.
  • 32. Key Terms: Independent variable. An input attribute used for building supervised or unsupervised learner models. Lift. The probability of class Ci given a sample taken from population P, divided by the probability of Ci given the entire population P. Lift chart. A graph that displays the performance of a data mining model as a function of sample size. Linear regression. A supervised learning technique that generalizes numeric data as a linear equation. The equation defines the value of an output attribute as a linear sum of weighted input attribute values.
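As a rough illustration of the lift definition above, the following Python sketch divides the rate of class Ci in a model-selected sample by its rate in the whole population. The function name `lift`, the "yes"/"no" response labels, and the data are hypothetical and chosen only for illustration.

```python
def lift(sample_labels, population_labels, target):
    """Lift = P(target | sample) / P(target | population)."""
    hits_sample = sample_labels.count(target)
    hits_population = population_labels.count(target)
    if not sample_labels or hits_population == 0:
        raise ValueError("sample must be non-empty and target must occur in the population")
    # Rearranged as a single division so no intermediate rounding is introduced.
    return (hits_sample * len(population_labels)) / (len(sample_labels) * hits_population)

# Hypothetical mailing data: 2 of 10 people in the population respond ("yes").
population = ["yes", "no", "no", "yes", "no", "no", "no", "no", "no", "no"]
# Hypothetical sample chosen by a model: 2 of the 4 selected people respond.
sample = ["yes", "no", "yes", "no"]

model_lift = lift(sample, population, "yes")
print("lift:", model_lift)
```

A lift above 1 means the model-selected sample contains the target class at a higher rate than a naïve (mass) mailing would reach.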
  • 33. Key Terms: Market basket analysis. A data mining strategy that attempts to find interesting relationships among retail products. Mean absolute error. For a set of training or test set instances, the mean absolute error is the average absolute difference between classifier-predicted output and actual output. Mean squared error. For a set of training or test set instances, the mean squared error is the average of the squared differences between classifier-predicted output and actual output. Neural network. A set of interconnected nodes designed to imitate the functioning of the human brain.
  • 34. Key Terms: Outliers. Atypical data instances. Prediction. A supervised learning strategy designed to determine future outcome. Root mean squared error. The square root of the mean squared error. Rule Maker. A supervised learner model for generating production rules from data. Statistical regression. A supervised learning technique that generalizes numerical data as a mathematical equation. The equation defines the value of an output attribute as a sum of weighted input attribute values.
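The three error measures defined in the key terms (mean absolute error, mean squared error, root mean squared error) can be sketched in a few lines of Python. The function names and the numeric values are hypothetical; the sketch assumes equal-length, non-empty lists of actual and predicted outputs.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average absolute difference between predicted and actual output."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average of the squared differences between predicted and actual output."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))

# Hypothetical test-set values for a model with numeric output.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
print(mae, mse, rmse)
```

Because the differences are squared before averaging, the mean squared error penalizes a few large errors more heavily than the mean absolute error does.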