Data Mining
Steps and Functionalities

Data Mining: A KDD Process
• Data mining is the core of the knowledge discovery process.
[Figure: KDD pipeline. Databases → Data Cleaning and Data Integration → Data Warehouse → Selection & Transformation → Task-relevant Data → Data Mining → Pattern Evaluation]

Steps of a KDD Process
• Data Cleaning
  • Handles noisy, inconsistent, and incomplete data
  • Missing values
  • Noisy data
    • Binning, clustering, etc.
  • Inconsistencies
    • Tools, functional dependencies
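
The cleaning techniques named above can be sketched in a few lines. This is only an illustration under stated assumptions: missing values are filled with the mean of the observed values (one of several possible strategies), and noisy data is smoothed by equal-frequency binning with bin means. The function names `fill_missing` and `smooth_by_bin_means` are hypothetical.

```python
from statistics import mean


def fill_missing(values):
    """Replace None entries with the mean of the observed values.

    Assumption: mean imputation; other fill strategies are equally valid.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into equal-frequency bins of
    `bin_size` items, and replace each value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        smoothed.extend([mean(bin_values)] * len(bin_values))
    return smoothed


if __name__ == "__main__":
    print(fill_missing([1, None, 3]))
    print(smooth_by_bin_means([4, 8, 15, 21, 21, 24, 25, 28, 34], 3))
```

Binning smooths a value by consulting its sorted neighbours, so it reduces local noise without looking at the whole data set.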

Steps of a KDD Process
• Data Integration
  • Schema integration
    • Entity identification problem
  • Redundancy
    • Correlation analysis
• Data Selection
  • Select only the task-relevant data
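
Correlation analysis is one way to spot redundant attributes after integration. The sketch below assumes numeric attributes and uses the Pearson correlation coefficient; a value close to +1 or -1 suggests that one attribute can largely be derived from the other. The function name `pearson` and the sample columns are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists.

    Assumption: neither list is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical columns from two integrated sources:
    # price in one currency and the same price in another.
    price_a = [1, 2, 3, 4]
    price_b = [2, 4, 6, 8]
    print(pearson(price_a, price_b))
```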

Steps of a KDD Process
• Data Transformation
  • Transform or consolidate data
  • Smoothing, normalization, feature construction
  • Data reduction: compression
• Data Mining
  • Intelligent methods are applied to extract patterns
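
Normalization, one of the transformations listed above, can be illustrated with two common variants. This sketch assumes min-max scaling to the range [0, 1] and z-score scaling with the population standard deviation; the slides do not say which variant is intended, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def min_max(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1.

    Assumption: the values are not all identical.
    """
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def z_scores(values):
    """Centre values on their mean and divide by the population
    standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    print(min_max([10, 20, 30]))
    print(z_scores([2, 4, 4, 4, 5, 5, 7, 9]))
```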

Steps of a KDD Process
• Pattern Evaluation
  • Interestingness measures
• Knowledge Presentation
  • Visualization
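
Data Mining Functionalities
• Descriptive
  • Characterize the general properties of the data
• Predictive
  • Perform inference on the current data to make predictions
• Mining
  • In parallel: several kinds of patterns can be mined at once
  • At various granularities (levels of abstraction)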

Data Mining Functionalities
• Concept / class description
• Association analysis
• Classification and prediction
• Cluster analysis
• Outlier analysis
• Evolution analysis

Concept / Class Description
• Data can be associated with classes / concepts
  • Computers, Printers
  • BigSpenders vs. BudgetSpenders
• Class / concept description
  • Classes and concepts can be summarized in concise and precise terms
  • Data characterization
  • Data discrimination

Data Characterization
• Summarization of the general characteristics
  • Data collected and aggregated
  • OLAP roll-up operation
  • Attribute-oriented induction
• Results: charts, cubes, rules
• Example
  • Characteristics of customers
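
A roll-up can be pictured as aggregating a measure while climbing a concept hierarchy. The sketch below assumes a hypothetical sales table with a city → country hierarchy and sums the measure at the coarser level; the row layout and the name `roll_up` are illustrative only, not a real OLAP API.

```python
from collections import defaultdict


def roll_up(rows, level, measure):
    """Sum `measure` over all rows that share the same value of `level`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[level]] += row[measure]
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical fact rows at city granularity.
    sales = [
        {"city": "Chennai", "country": "India", "sales": 100.0},
        {"city": "Mumbai", "country": "India", "sales": 150.0},
        {"city": "Boston", "country": "USA", "sales": 80.0},
    ]
    # Rolling up from city to country gives one total per country.
    print(roll_up(sales, "country", "sales"))
```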

Data Discrimination
• Compare a target class with one or more contrasting classes
  • The classes may be user-specified
• Examples:
  • Products whose sales increased vs. decreased
  • Regular shoppers vs. occasional shoppers
• Output includes comparative measures
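
One simple comparative measure is the difference in the mean of an attribute between the target class and a contrasting class. The sketch below assumes hypothetical shopper records with a `segment` label and a `basket` value; real discrimination output would usually compare several attributes.

```python
from statistics import mean


def compare_classes(records, class_key, target, contrast, attribute):
    """Return the mean of `attribute` for the target and contrasting
    classes, together with their difference."""
    target_values = [r[attribute] for r in records if r[class_key] == target]
    contrast_values = [r[attribute] for r in records if r[class_key] == contrast]
    target_mean = mean(target_values)
    contrast_mean = mean(contrast_values)
    return {
        "target_mean": target_mean,
        "contrast_mean": contrast_mean,
        "difference": target_mean - contrast_mean,
    }


if __name__ == "__main__":
    # Hypothetical data: average basket value per shopper.
    shoppers = [
        {"segment": "regular", "basket": 60},
        {"segment": "regular", "basket": 80},
        {"segment": "occasional", "basket": 20},
        {"segment": "occasional", "basket": 40},
    ]
    print(compare_classes(shoppers, "segment", "regular", "occasional", "basket"))
```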

Association Analysis
• Discovery of association rules
  • Form: X ⇒ Y
• Multi-dimensional
  • age(X, “20…29”) ∧ income(X, “20K…25K”) ⇒ buys(X, “Laptop”)
• Single-dimensional
  • buys(X, “Laptop”) ⇒ buys(X, “Software”)
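
A rule such as buys(X, “Laptop”) ⇒ buys(X, “Software”) is usually judged by its support and confidence. The sketch below assumes transactions are represented as Python sets of item names; the transaction data is made up for illustration, and no rule-mining algorithm such as Apriori is implemented here.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(itemset <= transaction for transaction in transactions)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent): support of both sides
    together divided by support of the antecedent alone."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


if __name__ == "__main__":
    # Hypothetical market-basket data.
    baskets = [
        {"Laptop", "Software"},
        {"Laptop"},
        {"Laptop", "Software", "Mouse"},
        {"Mouse"},
    ]
    print(support(baskets, {"Laptop", "Software"}))
    print(confidence(baskets, {"Laptop"}, {"Software"}))
```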

Classification and Prediction
• Classification
  • Finds models that describe and differentiate classes or concepts
  • Predicts the class
  • Uses training data
  • Models: rules, decision trees, neural networks, formulae
  • Preceded by relevance analysis (to eliminate irrelevant attributes)
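
To make "a model learned from training data" concrete, the sketch below trains a decision stump, that is, a one-level decision tree on a single numeric attribute. It is a deliberately minimal stand-in for the model families listed above, and it assumes exactly two class labels and at least two distinct attribute values. The income figures and function names are hypothetical; the class labels reuse the BigSpenders / BudgetSpenders example.

```python
def train_stump(samples):
    """Learn (threshold, low_label, high_label) from (value, label) pairs
    by choosing the split with the fewest training errors.

    Assumptions: exactly two labels, at least two distinct values.
    """
    labels = sorted({label for _, label in samples})
    values = sorted({value for value, _ in samples})
    # Candidate thresholds are midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for threshold in candidates:
        # Try both ways of assigning the two labels to the two sides.
        for low, high in (labels, labels[::-1]):
            errors = sum(
                (low if value <= threshold else high) != label
                for value, label in samples
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, low, high)
    _, threshold, low, high = best
    return threshold, low, high


def predict(model, value):
    """Classify a new value with a trained stump."""
    threshold, low, high = model
    return low if value <= threshold else high


if __name__ == "__main__":
    # Hypothetical training data: (annual income in thousands, class label).
    training = [
        (20, "BudgetSpender"),
        (25, "BudgetSpender"),
        (60, "BigSpender"),
        (80, "BigSpender"),
    ]
    model = train_stump(training)
    print(model, predict(model, 30), predict(model, 75))
```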

Classification and Prediction
• Prediction
  • The derived model is used for prediction
  • Data value prediction
  • Class label prediction (classification)
  • Trend identification
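
Data value prediction can be illustrated with a least-squares line fitted to training points and then used to estimate an unseen value. The slides do not name a specific method, so simple linear regression is an assumption here, as are the function names and sample data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Assumption: xs contains at least two distinct values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_value(model, x):
    """Use the derived model to predict a numeric value for x."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical training data lying on y = 2x + 1.
    model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(model, predict_value(model, 5))
```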

Cluster Analysis
• Unsupervised
  • Class labels are missing in the training set
• Maximize intra-class similarity
• Minimize inter-class similarity
• Hierarchy of classes
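
The idea of grouping unlabeled objects so that members of a cluster are similar to each other can be sketched with k-means on one-dimensional data. k-means is chosen here only as a familiar example and is not named in the slides. Initial centroids are passed in explicitly to keep the run deterministic, and the hierarchical clustering hinted at by "hierarchy of classes" is not covered.

```python
def k_means_1d(points, initial_centroids, iterations=10):
    """Alternate between assigning each point to its nearest centroid
    and moving each centroid to the mean of its assigned points."""
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: abs(point - centroids[i]),
            )
            clusters[nearest].append(point)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


if __name__ == "__main__":
    # Hypothetical unlabeled data with two obvious groups.
    print(k_means_1d([1, 2, 3, 10, 11, 12], [1, 10]))
```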

Outlier Analysis
• Objects that do not comply with the general behavior of the data
• Noise vs. rare events
  • Fraud detection
• Statistical tests
• Deviation-based methods
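
A very simple statistical test flags values that lie more than a chosen number of standard deviations from the mean. The sketch below assumes roughly bell-shaped data and a threshold of 2; both are illustrative choices, and the function name and transaction amounts are hypothetical.

```python
from statistics import mean, pstdev


def z_score_outliers(values, threshold=2.0):
    """Return the values whose distance from the mean exceeds
    `threshold` population standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]


if __name__ == "__main__":
    # Hypothetical transaction amounts; one is far from the rest.
    print(z_score_outliers([10, 12, 11, 13, 12, 11, 95]))
```

Whether a flagged value is noise to discard or a rare event worth investigating (for example, possible fraud) is a decision the test itself cannot make.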

Evolution Analysis
• Trend detection
• Time series data
• Involves other functionalities
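
Trend detection on time series data can be sketched by smoothing the series with a moving average and comparing the first and last smoothed values. This is a minimal illustration under the assumption of evenly spaced observations; the function names and the sample series are hypothetical.

```python
def moving_average(series, window):
    """Mean of each run of `window` consecutive observations."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


def trend_direction(series, window):
    """Label the overall trend by comparing the ends of the smoothed series.

    Assumption: the series has at least `window` observations.
    """
    smoothed = moving_average(series, window)
    if smoothed[-1] > smoothed[0]:
        return "upward"
    if smoothed[-1] < smoothed[0]:
        return "downward"
    return "flat"


if __name__ == "__main__":
    # Hypothetical weekly sales figures.
    weekly_sales = [3, 5, 4, 6, 8, 7, 9]
    print(moving_average(weekly_sales, 3))
    print(trend_direction(weekly_sales, 3))
```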
