Presented to :
Dr. Rabie
By :
Amr Abd EL Latief Abd El Al
Data Mining Def.
 Def. :
 Data mining is the extraction of interesting patterns or
knowledge from huge amounts of data.
Also known by different names :
 knowledge discovery (mining) in databases (KDD)
 knowledge extraction,
 data/pattern analysis,
 data archeology,
 data dredging,
 information harvesting,
 business intelligence and others. [1]
What is Data Mining
 Data Mining enables data exploration, data analysis,
and data visualization of huge databases at a high level
of abstraction, without a specific hypothesis in mind.
 Data mining works through modeling: a model is built
from the data and then used to make predictions.
Data Mining Technologies
 include :
 artificial neural networks
 decision trees
 genetic algorithms
 machine learning
 evolutionary computing
 multi-objective evolutionary
computing (MOEA)
Data Mining System Arch.
Data Mining Procedure
The Process of Data Mining
Classifications
[Diagram: Data Types (Application, Data Structure); Functionality]
Data Types Application S.V.
 Business transactions
 Scientific data
 Medical and personal data
 Surveillance video and pictures
 Satellite sensing
 Text reports and memos (e-mail messages)
 Most of the communications
 The World Wide Web repositories
types of data (Data Structure S.V.)
 Flat files
 Relational Databases
 Data Warehouses
 Transaction Databases
 Multimedia Databases
 Spatial Databases
 World Wide Web
FUNCTIONALITIES AND
CLASSIFICATIONS OF
DATA MINING
 Characterization
 Discrimination
 Association analysis
 Classification
 uses given class labels to order the objects in the data
collection. Classification approaches normally use a training
set where all objects are already associated with known class
labels. The classification algorithm learns from the training
set and builds a model. The model is used to classify new
objects.
 Prediction
Data Mining Systems
 specialized, classified according to:
 the type of data source mined
 the data model drawn on
 the kind of knowledge discovered
 the mining techniques used
 comprehensive
Classification according to the type
of data source mined
 This classification categorizes data mining systems
according to the type of data handled:
 spatial data
 multimedia data
 time-series data
 text data
 World Wide Web.
Classification according to the data
model drawn on
 This classification categorizes data mining systems
based on the data model involved:
 Relational database
 object-oriented database
 data warehouse
 Transactional
 others
Classification according to the kind
of knowledge discovered
 This classification categorizes data mining systems
based on the kind of knowledge discovered or data
mining functionalities:
 Characterization
 discrimination
 Association
 classification
 clustering
 others
Classification according to mining
techniques used
 The classification categorizes data mining systems
according to the data analysis approach used:
 machine learning
 neural networks
 Genetic algorithms
 Statistics
 visualization
 database oriented
 data warehouse-oriented
 others
Classification according to the degree of
user interaction involved in the
data mining process
 query-driven systems,
 interactive exploratory systems
 autonomous systems
Note:
 A comprehensive system would provide a wide variety
of data mining techniques to fit different situations
and options, and offer different degrees of user
interaction.
[2]
Papers
Data Mining Goals
 The two main goals of DM are:
 description
 prediction
 Standard tasks in the field of DM are: description,
clustering, association discovery, sequential pattern
analysis, classification and regression.
 Description : can be obtained by characterization or by
discrimination.
 Characterization : a summarization of the general features
of a class.
 Discrimination : differs little from characterization; it
characterizes a class by comparison with another one.
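Characterization and discrimination can be illustrated with a minimal sketch. The records, the attribute names (`age`, `income`) and the choice of the mean as the summary are all assumptions made for illustration; they do not come from the presentation.

```python
from statistics import mean

# Hypothetical toy records: (class label, age, income in thousands).
records = [
    ("buyer", 25, 40), ("buyer", 35, 60), ("buyer", 30, 50),
    ("non_buyer", 50, 30), ("non_buyer", 60, 20),
]

def characterize(rows, label):
    """Summarize the general features of one class (mean of each attribute)."""
    members = [r for r in rows if r[0] == label]
    return {"age": mean(r[1] for r in members),
            "income": mean(r[2] for r in members)}

def discriminate(rows, target, contrast):
    """Describe `target` by comparison with `contrast` (difference of means)."""
    t = characterize(rows, target)
    c = characterize(rows, contrast)
    return {attr: t[attr] - c[attr] for attr in t}

buyer_profile = characterize(records, "buyer")
buyer_vs_rest = discriminate(records, "buyer", "non_buyer")
```

Here `buyer_profile` describes one class on its own, while `buyer_vs_rest` only has meaning relative to the contrasting class.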
Data Mining Goals
 Clustering : differs from classification in that it analyses
data objects without knowing their class.
 Association discovery : results in a set of association rules
which represent attribute-value conditions frequently
occurring in a given set of data.
 Sequential pattern analysis : consists in searching for
frequently occurring patterns related to time.
 Regression : uses existing values of some variables in order
to forecast the values of another, continuous variable.
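As an illustration of regression, the sketch below fits a straight line by ordinary least squares and uses it to forecast a continuous value. The data points and the function names `fit_line` and `forecast` are hypothetical; least squares is only one of many regression techniques.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def forecast(model, x):
    """Predict the continuous target for a new value of the known variable."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical observations: known variable xs, continuous target ys.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
model = fit_line(xs, ys)
next_value = forecast(model, 5)
```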
Machine Learning
 An ML system uses a finite set of objects (examples) which
represent observations of the environment; the learning
algorithm learns a model from this set, which is called the
training set.
 Data sources for ML in DM include:
 databases
 data warehouses
 flat files
Classification in DM
 Classification:
is a form of data analysis that can be used to extract
models describing important classes or to predict future
trends.
 It represents :
a learning paradigm which consists in segmenting data by
assigning it to groups, or classes, that are already defined.
 The usual assumption is a small database size, but in Data Mining
the technique must be scalable.
Classification in DM
 classes are represented by:
the values of a particular attribute called the goal attribute;
the remaining attributes are called predicting attributes.
 the resulting model is usually represented as:
a set of IF-THEN prediction rules, each of which
predicts a class from the predicting attributes.
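A set of IF-THEN prediction rules could be represented as in the sketch below. The attributes (`outlook`, `humidity`, `windy`), the rules themselves and the default class are hypothetical; the goal attribute here is a yes/no decision.

```python
# Each rule: (IF conditions on predicting attributes, THEN predicted class).
RULES = [
    ({"outlook": "sunny", "humidity": "high"}, "no"),
    ({"outlook": "rainy", "windy": True}, "no"),
]
DEFAULT_CLASS = "yes"  # assumed fallback when no rule fires

def predict(example, rules=RULES, default=DEFAULT_CLASS):
    """Return the class of the first rule whose conditions all hold."""
    for conditions, predicted_class in rules:
        if all(example.get(attr) == value for attr, value in conditions.items()):
            return predicted_class
    return default

first = predict({"outlook": "sunny", "humidity": "high", "windy": False})
second = predict({"outlook": "overcast", "humidity": "high", "windy": True})
```

Rules in this form stay readable: each one can be checked by a person against the predicting attributes it mentions.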
ML in Classification
 Procedure:
 Algorithms are first applied to the so-called training set,
which contains training examples with a known class, to
discover rules.
 The model is then used to classify a separate set of examples,
called the test set.
 The predictive accuracy of the model is evaluated on the
test set.
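The procedure can be sketched end to end with a deliberately simple learner: for one predicting attribute, each value is mapped to the majority class seen in the training set. The learner, the data and the names `learn_rules` and `accuracy` are assumptions for illustration, not the method used in the presentation.

```python
from collections import Counter, defaultdict

def learn_rules(training_set, attribute):
    """Discover one rule per attribute value: predict its majority class."""
    counts = defaultdict(Counter)
    for example, label in training_set:
        counts[example[attribute]][label] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}

def accuracy(rules, test_set, attribute, default):
    """Fraction of test examples whose predicted class matches the known class."""
    correct = sum(1 for example, label in test_set
                  if rules.get(example[attribute], default) == label)
    return correct / len(test_set)

# Hypothetical examples: (predicting attributes, known class).
training_set = [
    ({"outlook": "sunny"}, "no"), ({"outlook": "sunny"}, "no"),
    ({"outlook": "sunny"}, "yes"),
    ({"outlook": "overcast"}, "yes"),
    ({"outlook": "rainy"}, "yes"), ({"outlook": "rainy"}, "yes"),
]
test_set = [
    ({"outlook": "sunny"}, "no"), ({"outlook": "overcast"}, "yes"),
    ({"outlook": "rainy"}, "no"), ({"outlook": "rainy"}, "yes"),
]

rules = learn_rules(training_set, "outlook")
test_accuracy = accuracy(rules, test_set, "outlook", default="yes")
```

Keeping the test set separate from the training set matters: accuracy measured on the training examples would overstate how well the rules generalize.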
Classification Methods
 Main classification methods are:
 decision tree induction
 scalability problem
 Bayesian classification
 neural network learning
 Drawbacks:
 time-consuming
 difficult for humans to interpret the results
ASSOCIATION ANALYSIS
 Association rules show relationships between attributes. Their
typical application domain is market basket and
transaction data analysis.
 Association Rules:
 An association rule is generally defined as an expression
 X => Y,
 where X and Y are sets of attribute-value terms.
ASSOCIATION ANALYSIS
 Rules do not have to be strictly correct in order
to be useful. It is generally required to find
rules which are true to some degree only.
 strict: X implies Y
 to some degree: X tends to imply Y
 Measured by support and confidence
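Support and confidence can be computed directly from their usual definitions: the support of an itemset is the fraction of transactions containing it, and the confidence of X => Y is support(X and Y together) divided by support(X). The transactions below are hypothetical market-basket data.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(x, y, transactions):
    """How often Y holds among the transactions where X holds."""
    return support(x | y, transactions) / support(x, transactions)

rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
```

A confidence below 1 is exactly the "X tends to imply Y" case: the rule holds in some, but not all, of the transactions that contain X.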
Apriori Algorithm
 Depends on frequent occurrence
 Drawbacks :
 Large number of database scans
 Large size of generated intermediate sets.
 Apriori mines only Boolean and single-dimensional
association rules.
 These rules are adapted to market basket analysis and can
GA Advantages in Data Mining
 A DM problem needs: robustness of solutions and
scalability
 GA Advantages:
 high ability to find patterns in very large spaces
 parallel implementation
 performs a kind of global search rather than local
hill-climbing
 the patterns produced are directly understandable
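The population-based search behind these points can be sketched as a minimal genetic algorithm over bit strings. The encoding, the parameters and the toy fitness function (number of 1 bits) are assumptions for illustration; in rule mining a bit string might, for example, encode which attribute-value terms a rule contains.

```python
import random

def genetic_search(fitness, n_bits, pop_size=20, generations=30,
                   mutation_rate=0.05, seed=0):
    """Evolve bit strings toward higher fitness; returns the best one found."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)      # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]          # bit-flip mutation
            children.append(child)
        population = parents + children         # parents survive unchanged
    return max(population, key=fitness)

best = genetic_search(sum, n_bits=16)
initial_best = genetic_search(sum, n_bits=16, generations=0)
```

Because many candidates are kept at once and recombined, the search samples several regions of the space in parallel instead of climbing from a single starting point. Evaluating each candidate's fitness is independent of the others, which is what makes a parallel implementation natural.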
Search Challenges
 Scalability is an important research
challenge too.
 MULTI-OBJECTIVE RULE EXTRACTION
 MOEA Issues
Apriori Ex.
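A minimal sketch of the Apriori idea on a toy transaction list follows. The data and the `min_support` threshold are hypothetical, and this is an illustration rather than the presentation's original example. It grows frequent itemsets level by level, keeping a candidate only if all of its smaller subsets are already frequent.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent(candidates):
        # This sketch rescans the transactions for every candidate; repeated
        # scanning is the cost usually cited as a drawback of Apriori.
        result = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:
                result[c] = s
        return result

    level = frequent({frozenset([item]) for t in transactions for item in t})
    all_frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        level = frequent(candidates)
        all_frequent.update(level)
        k += 1
    return all_frequent

# Hypothetical market-basket transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent_itemsets = apriori(baskets, min_support=0.5)
```

On this data the three-item candidate is pruned without being counted, because one of its two-item subsets is not frequent.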

Data mining concepts and work

  • 1. Presented to : Dr. Rabie By : Amr Abd EL Latief Abd El Al
  • 2. Data Mining Def.  Def. :  Data mining is the extraction of interesting patterns or knowledge from huge amount of data. Known different names :  knowledge discovery (mining) in databases (KDD)  knowledge extraction,  data/pattern analysis,  data archeology,  data dredging,  information harvesting,  business intelligence and others. [1]
  • 3. What is Data Mining  Data Mining enables data exploration, data analysis, and data visualization of huge databases at a high level of abstraction, without a specific hypothesis in mind.  working of data mining is understood by using a method called modeling with it to make predictions.
  • 4. Data Mining Technologies  include :  artificial neural networks  decision trees  genetic algorithms.  Machine Learning .  Evolutionary Computing  MOEA Multi objective Evolutionary Computing
  • 7. The Process of Data Mining
  • 9. Data Types Application S.V.  Business transactions  Scientific data  Medical and personal data  Surveillance video and pictures  Satellite sensing  Text reports and memos (e-mail messages)  Most of the communications  The World Wide Web repositories
  • 10. types of data (Data Structure S.V.)  Flat files  Relational Databases  Data Warehouses  Transaction Databases  Multimedia Databases  Spatial Databases  World Wide Web
  • 11. FUNCTIONALITIES AND CLASSIFICATIONS OF DATA MINING  Characterization  Discrimination  Association analysis  Classification  uses given class labels to order the objects in  the data collection Classification approaches normally use a  training set where all objects are already associated with  known class labels. The classification algorithm learns from  the training set and builds a model. The model is used to  classify new objects.  Prediction  Prediction
  • 12. Data Mining Systems specialized data source mined dataClassification according to the data drawn on modmodel el drawn on kind of knowledge discovered mining techniques used comprehensive
  • 13. Classification according to the type of data source mined  This classification categorizes data mining systems according to the type of data handled:  spatial data  multimedia data  time-series data  text data  World Wide Web.
  • 14. Classification according to the data model drawn on  This classification categorizes data mining systems based on the data model involved:  Relational database  object-oriented database  data warehouse  Transactional  others
  • 15. Classification according to the king of knowledge discovered  This classification categorizes data mining systems based on the kind of knowledge discovered or data mining functionalities:  Characterization  discrimination  Association  classification  clustering  others
  • 16. Classification according to mining techniques used  The classification categorizes data mining systems according to the data analysis approach used:  machine learning  neural networks  Genetic algorithms  Statistics  visualization  database oriented  data warehouse-oriented  others
  • 17. take into account the degree of user interaction involved in the data mining process  query-driven systems,  interactive exploratory systems  autonomous systems Note:  A comprehensive system would provide a wide variety of data mining techniques to fit different situations and options, and offer different degrees of user interaction.
  • 19. Data Mining Goals  The two main goals of DM are:  description  prediction  Standard tasks in the field of DM are: description, clustering, association discovery, sequential pattern analysis, classification and regression.  Description: can be obtained by characterization or by discrimination.  Characterization: a summarization of the general features of the objects in a class.  Discrimination: does not differ much from characterization; it consists of characterizing a class by comparison with another one.
  • 20. Data Mining Goals  Clustering: differs from classification since it analyses data objects without knowing their class.  Association discovery: results in a set of association rules which represent attribute-value conditions that frequently occur together in a given set of data.  Sequential pattern analysis: consists in searching for frequently occurring patterns related to time.  Regression: uses existing values of some variables in order to forecast the values of another, continuous variable.
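To make the regression task concrete, here is a minimal sketch of a one-variable least-squares fit in plain Python. The function names `fit_line` and `predict` and the sample values are hypothetical and chosen only for illustration; the slides do not prescribe a particular regression method.

```python
def fit_line(xs, ys):
    """Ordinary least squares for the model y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Forecast the continuous target for a new value of x."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical existing values: the target happens to follow y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
model = fit_line(xs, ys)
forecast = predict(model, 5.0)
```

The fitted model is then used to forecast the target variable for a value of `x` that was not in the existing data.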
  • 21. Machine Learning  An ML system uses a finite set of objects (examples) which represent observations of the environment; the learning algorithm learns a model from this set, which is called the training set.  In DM, the data used for ML may be held in:  databases  data warehouses  flat files
  • 22. Classification in DM  Classification: a form of data analysis that can be used to extract models describing important classes or to predict future trends.  It represents: a learning paradigm which consists in segmenting data by assigning it to groups, or classes, that are already defined.  The usual assumption is a small database size, but in Data Mining the technique must be scalable.
  • 23. Classification in DM  Classes are represented by: the values of a particular attribute called the goal attribute; the remaining attributes are called predicting attributes.  The resulting model is usually represented as: a set of IF-THEN prediction rules, where each one predicts a class from the predicting attributes.
  • 24. ML in Classification  Procedure:  Algorithms are first applied to the so-called training set, which contains training examples with a known class, to discover rules.  The model is then used for classification on a set of examples called the test set.  The predictive accuracy of the model is evaluated on the test set.
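The procedure can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it learns one IF-THEN rule per value of a single predicting attribute (predicting the majority class for that value), and the attribute names `outlook` and `play` and all example records are hypothetical.

```python
from collections import Counter, defaultdict


def learn_rules(training_set, attribute, goal):
    """Learn rules of the form IF attribute = value THEN goal = majority class."""
    counts = defaultdict(Counter)
    for example in training_set:
        counts[example[attribute]][example[goal]] += 1
    rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    # Fallback for attribute values never seen in the training set.
    default = Counter(e[goal] for e in training_set).most_common(1)[0][0]
    return rules, default


def classify(model, example, attribute):
    """Apply the learned rules to one example."""
    rules, default = model
    return rules.get(example[attribute], default)


def accuracy(model, test_set, attribute, goal):
    """Fraction of test examples whose predicted class matches the known class."""
    correct = sum(classify(model, e, attribute) == e[goal] for e in test_set)
    return correct / len(test_set)


# Hypothetical training examples with a known class ("play").
training_set = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "overcast", "play": "yes"},
    {"outlook": "overcast", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "rainy", "play": "no"},
]
# Hypothetical test examples, kept separate from the training set.
test_set = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "overcast", "play": "yes"},
    {"outlook": "rainy", "play": "no"},
    {"outlook": "rainy", "play": "yes"},
]

model = learn_rules(training_set, "outlook", "play")
test_accuracy = accuracy(model, test_set, "outlook", "play")
```

Here `play` plays the role of the goal attribute and `outlook` is the only predicting attribute; real classifiers combine many predicting attributes.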
  • 25. Classification Methods  Main classification methods are:  decision tree induction  scalability problem  Bayesian classification  neural network learning  Drawbacks:  time-consuming  difficulty for humans to interpret their results
  • 26. ASSOCIATION ANALYSIS  Association rules show relationships between attributes. Their typical application domain is market basket and transaction data analysis.  Association Rules:  An association rule is generally defined as an expression  X => Y,  where X and Y are sets of attribute-value terms.
  • 27. ASSOCIATION ANALYSIS  Rules are not supposed to be strictly correct in order for them to be useful. It is generally required to find rules which are true to some degree only.  X implies Y  X tends to imply Y  Support and confidence
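Support and confidence are the usual measures of how true a rule X => Y is: support is the fraction of transactions that contain both X and Y, and confidence is the fraction of transactions containing X that also contain Y. A minimal sketch, with hypothetical function names and a hypothetical five-transaction basket data set:

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, x, y):
    """Support of X and Y together, divided by the support of X alone."""
    return support(transactions, set(x) | set(y)) / support(transactions, x)


# Hypothetical market-basket data: one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
```

For the rule bread => milk, three of the five transactions contain both items and four contain bread, so the rule holds in three out of four cases: it "tends to imply" rather than strictly implies.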
  • 28. Apriori Algorithm  Depends on frequent occurrence  Drawbacks:  Large number of database scans  Large size of generated intermediate sets  Apriori mines only Boolean and single-dimensional association rules.  These rules are adapted to market basket analysis and can
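A simplified sketch of the level-wise frequent-itemset search behind Apriori may help show where the two drawbacks come from. It is an unoptimized illustration, not a reference implementation; the function name, the threshold `min_support` and the sample transactions are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all frequent itemsets, level by level."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent(candidates):
        # Counting support re-scans the transactions at every level.
        result = {}
        for c in candidates:
            s = sum(c <= t for t in transactions) / n
            if s >= min_support:
                result[c] = s
        return result

    items = {frozenset([i]) for t in transactions for i in t}
    level = frequent(items)
    all_frequent = dict(level)
    k = 2
    while level:
        prev = list(level)
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        level = frequent(candidates)
        all_frequent.update(level)
        k += 1
    return all_frequent


# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]
frequent_itemsets = apriori(transactions, min_support=0.5)
```

The candidate sets built in the join step are the intermediate sets that can grow large, and each level requires another pass over the transactions. Items are treated as simply present or absent, which corresponds to Boolean, single-dimensional rules.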
  • 29. GA Advantages in Data Mining  A DM problem needs: robustness of solutions and scalability  GA Advantages:  High ability to find patterns in very large search spaces.  Parallel implementation is possible.  A GA performs a kind of global search rather than local hill-climbing.  The patterns produced are directly understandable.
  • 30. Search Challenges  The scalability problem is an important research challenge too.  MULTI-OBJECTIVE RULE EXTRACTION  MOEA Issues