SlideShare uma empresa Scribd logo
1 de 13
Baixar para ler offline
Agent Based Frequent Set Meta
Mining: Introducing EMADS
Kamal Ali Albashiri, Frans Coenen, and Paul Leng
Department of Computer Science, The University of Liverpool, UK
IFIP 2008,Milano, Italy
Outline
 Data Mining Process
 Multi-Agent Systems For Data Mining
 EMADS Vision
 Meta ARM Application
 Experimentation and Results
 Conclusion and Current Work
Data Mining Process
The DM process can be summarized as follows:
1. Decide what you want to learn.
 Learn a classification model so as to predict some feature(s) of
new example data cases (Data Classification).
 Find patterns (associations) within data (Association Rule
Mining or ARM).
 Group data into clusters and then be able to add new example
data cases to the appropriate cluster according to the defining
features of the identified clusters and the new case (Data
Clustering).
2. Select and prepare your data.
 Select relevant data for the desired objective. For instance
training and test sets.
 Consolidate the data into a single data warehouse to which
mining algorithms can be applied.
 (Usually) some form of data transform (normalisation,
descretisation, etc), possibly data cleaning and so on.
Data Mining Process cont’
3. Choose and configure the mining task or tasks.
For instance, a user may wish to cluster users together that visited
similar Web pages, and then derive association rules that show how
those users and pages are related.
4. Select and configure the mining algorithms.
Many data-mining algorithms are available for a given task. Algorithms
differ not only in the potential accuracy of their end-product, but also in
the computational resources they require.
5. Build the data-mining model.
The output from executing a data-mining task. The model can be
viewed as an abstraction of the data suited with respect to the objective.
The model might be a neural network, a decision tree, or even a set of
rules understandable by humans.
6. Test and refine the models.
In some cases several models may be created, evaluated and an
enhanced (in some sense) result distilled.
7. Report findings or predict future outcomes.
Finally, report the findings and/or use the generated data-mining
models (for example to predict future example cases).
MultiAgent Systems For Data Mining
 Process automation. The current trend is towards automating as much of
the data mining process as possible. "Even those not expert in data mining
can reap the benefits of data-mining technologies," .
 Task matching. Agents can automatically match algorithms to a desired
data-mining objective; for instance, a clustering algorithm to create data
clusters, or an association-rules algorithm to identify association rules.
 Data and Task matching. Agents can automatically match mining tasks to
relevant data.
 Result evaluation. Agents can evaluate the accuracy of each model with
respect to existing (training) data, and select a "best" (fit for purpose) model.
 Privacy of sources. Individual owners of agents can preserve the privacy
and security of their raw data and only share the results of data mining
activities.
 Extendibility. Flexibility to incorporate new data mining techniques and data
sources (adding/removing agents)
 Computational efficiency. Enhance the efficiency of parallel data mining
processing.
 Scalability and Resublity. MAS will offer access to a wider pool of available
data and allow reuse to previously generated information.
EMADS can (or will be able to) provide the following:
 Enable and accelerate the deployment of practical solutions to data mining
problems.
 Provide a data mining agent space into which data mining agents of all sorts
can be launched.
 Allow anybody who is prepared to share their data to make that data
available using an appropriately defined data agent.
 Allow anybody who wishes to obtain data mining results the facility to launch
an appropriately define query agent.
 Allow anybody the facility to add a new data mining agent that can apply
some data mining algorithm to data without significant specialised
knowledge of multi-agent based data mining.
EMADS (Extendible Multi-Agent Data mining System)
Vision
EMADS Meta ARM Application
To illustrate some of the features of EMADS a Meta ARM scenario is considered in this paper.
ARM Problem Definition
 Given a database D we wish to find all the frequent itemsets (F)
and then use this knowledge to produce high confidence
association rules (ARs) of the form A→B.
 Note: Finding F is the most computationally expensive part, once
we have the frequent sets generating ARs is straight forward
TID Atts.
1
2
3
4
5
1 2
1 2 3
2 3 4
2 3 4
3 4 5
6 4 5
1
2
2
4
5
2
3
2
4
3
4
2
3
3
Support threshold = 40%
(count of 2.4) 2
2
4
Apriori Example (using a T-tree)
Frequent Set Meta Mining
 Meta ARM: A given ARM algorithm is applied to N raw data sets
producing N collections of frequent item sets.
 The objective is then to merge the different sets of results into a single
meta set of frequent itemsets with the aim of generating a set of
Association Rules (ARs).
 Key issue: wherever an itemset is frequent in a data source A but not in
a data source B a check for any contribution from data source B is
required (to obtain a total support count).
Meta ARM approach
Data Mining
Algorithm
Data Mining
Algorithm
Data Mining
Algorithm
T-Tree T-Tree
T-Tree
Meta ARM Algorithm Merged T-Tree
Legend
Data Flow
RTD
1 2 N
Meta ARM algorithms
 Issue in meta ARM is how best to combine the results from N
different sources, in the most computationally efficient manner, into
a single meta set of itemsets?
 Meta ARM algorithms: We can identify a number of different strategies of
combining the individually obtained T-Trees of N datasets into one global T-
Tree making use of Return To Data lists (RTD) to obtain additional counts.
1. Brute Force: Merge T-trees one by one generating (N) RTD lists,
pruning the T-tree at end of the merge process.
2. Apriori: Merge all T-trees level by level generating (K*N) RTD lists,
pruning the T-tree at each level (K = number of levels).
3. Hybrid 1: Merge top level in the Apriori manner and the rest in BF
manner.
4. Hybrid 2: Merge top two levels in the Apriori manner and the rest in BF
manner.
5. A “bench mark” system is described to allow for appropriate
comparison.
Meta ARM algorithms Example
 Brute Force Meta ARM

 merge
 inclusion of additional counts (RTD)
 final prune.
2 4
2
3 2
2
2 3
2 4
2 4
1 2
1
2 3
2
1 2
1 2
2 4
2
1
6
2
BF RTD Lists
{1}
{4}
DS2
DS1
Apriori RTD
Lists (level 1)
Dataset 2
(3 X 4)
T-Tree T-Tree
Merged T-Tree
null
{2,4}
DS2
DS1
RTD Lists
(level 2)
Example: relative support =50% (3 records)
Generated
rules
2 → 4 (50%)
4 → 2 (100%)
1
2
4
2
2
2
• Apriori Meta ARM
1. K = 1
2. Generate candidate K-itemsets
3. Add supports for level K from N T-
trees OR add to RTD list
4. Get additional support
5. Prune K-itemsets
6. K = K+1
7. Generate K-itemsets
8. End Loop
relative support =50% (1.5 records count)
Dataset 1
(3 X 4)
3
3
3
2
{1,2}
{2,4}
{1}
{4}
DS2
DS1
Experimentation and Results
1. The number of data sources (data agents running on different
machines )
- Each dataset has 100,000 transactions.
- Support Threshold = 1%
 Measured:
 processing time,
 the size of the RTD lists (messages’ size)
 and the number of messages.
0
0.5
1
1.5
2 3 4 5 6 7 8 9 10
Number of Data Sets
Time
(sec)
Apriori
Bench Mark
Hybrid & BF
0
50
100
150
200
2 3 4 5 6 7 8 9 10
Number of Data Sets
Message
Size
(Kbyte)
Apriori
Bench Mark
Hybrid & BF
0
10
20
30
40
50
2 3 4 5 6 7 8 9 10
Number of Data Sets
Number
Generated
RTD
Lists
Apriori
Brute Force
Hybrid 1
Hybrid 2
Bench Mark
Conclusion and Current Work
 Conclusion
 Multi-Agent Systems (MAS) can offer a well suited architecture
for data mining in a decentralised, and anarchic manner so that
much greater benefit can be obtained from current data mining
capabilities and available data.
 We described a extension of ARM where we build a meta set of
frequent itemsets from a collection of component sets which
have been generated in an autonomous manner without
centralised control in a the EMADS environment.
 Overall the Hybrid 2 algorithm is the best.
 Current Work
 Classification application
 Data normalization (data pre-processing) agents
 Wrapper agents: agents that can be used to incorporate new
mining algorithms into EMADS without significant specialised
knowledge of the system.
Questions?

Mais conteúdo relacionado

Semelhante a ifip2008albashiri.pdf

Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...
Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...
Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...Spark Summit
 
Lesson 2 data preprocessing
Lesson 2   data preprocessingLesson 2   data preprocessing
Lesson 2 data preprocessingAbdurRazzaqe1
 
Simplified Data Processing On Large Cluster
Simplified Data Processing On Large ClusterSimplified Data Processing On Large Cluster
Simplified Data Processing On Large ClusterHarsh Kevadia
 
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASETSURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASETEditor IJMTER
 
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351IRJET Journal
 
A SERIAL COMPUTING MODEL OF AGENT ENABLED MINING OF GLOBALLY STRONG ASSOCIATI...
A SERIAL COMPUTING MODEL OF AGENT ENABLED MINING OF GLOBALLY STRONG ASSOCIATI...A SERIAL COMPUTING MODEL OF AGENT ENABLED MINING OF GLOBALLY STRONG ASSOCIATI...
A SERIAL COMPUTING MODEL OF AGENT ENABLED MINING OF GLOBALLY STRONG ASSOCIATI...ijcsa
 
SELF LEARNING REAL TIME EXPERT SYSTEM
SELF LEARNING REAL TIME EXPERT SYSTEMSELF LEARNING REAL TIME EXPERT SYSTEM
SELF LEARNING REAL TIME EXPERT SYSTEMcscpconf
 
IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...
IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...
IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...IRJET Journal
 
DMDW Lesson 05 + 06 + 07 - Data Mining Applied
DMDW Lesson 05 + 06 + 07 - Data Mining AppliedDMDW Lesson 05 + 06 + 07 - Data Mining Applied
DMDW Lesson 05 + 06 + 07 - Data Mining AppliedJohannes Hoppe
 
IRJET- Top-K Query Processing using Top Order Preserving Encryption (TOPE)
IRJET- Top-K Query Processing using Top Order Preserving Encryption (TOPE)IRJET- Top-K Query Processing using Top Order Preserving Encryption (TOPE)
IRJET- Top-K Query Processing using Top Order Preserving Encryption (TOPE)IRJET Journal
 
DEVELOPING A NOVEL MULTIDIMENSIONAL MULTIGRANULARITY DATA MINING APPROACH FOR...
DEVELOPING A NOVEL MULTIDIMENSIONAL MULTIGRANULARITY DATA MINING APPROACH FOR...DEVELOPING A NOVEL MULTIDIMENSIONAL MULTIGRANULARITY DATA MINING APPROACH FOR...
DEVELOPING A NOVEL MULTIDIMENSIONAL MULTIGRANULARITY DATA MINING APPROACH FOR...cscpconf
 
Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modelingvivekjv
 
Machine Learning as service
Machine Learning as serviceMachine Learning as service
Machine Learning as serviceNihal Mehdi
 
Scalable frequent itemset mining using heterogeneous computing par apriori a...
Scalable frequent itemset mining using heterogeneous computing  par apriori a...Scalable frequent itemset mining using heterogeneous computing  par apriori a...
Scalable frequent itemset mining using heterogeneous computing par apriori a...ijdpsjournal
 
A Performance Based Transposition algorithm for Frequent Itemsets Generation
A Performance Based Transposition algorithm for Frequent Itemsets GenerationA Performance Based Transposition algorithm for Frequent Itemsets Generation
A Performance Based Transposition algorithm for Frequent Itemsets GenerationWaqas Tariq
 
Summarizing Software API Usage Examples Using Clustering Techniques
Summarizing Software API Usage Examples Using Clustering TechniquesSummarizing Software API Usage Examples Using Clustering Techniques
Summarizing Software API Usage Examples Using Clustering TechniquesNikos Katirtzis
 
BI Chapter 04.pdf business business business business
BI Chapter 04.pdf business business business businessBI Chapter 04.pdf business business business business
BI Chapter 04.pdf business business business businessJawaherAlbaddawi
 

Semelhante a ifip2008albashiri.pdf (20)

Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...
Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...
Practical Large Scale Experiences with Spark 2.0 Machine Learning: Spark Summ...
 
Lesson 2 data preprocessing
Lesson 2   data preprocessingLesson 2   data preprocessing
Lesson 2 data preprocessing
 
Simplified Data Processing On Large Cluster
Simplified Data Processing On Large ClusterSimplified Data Processing On Large Cluster
Simplified Data Processing On Large Cluster
 
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASETSURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
 
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351
Irjet v4 iA Survey on FP (Growth) Tree using Association Rule Mining7351
 
A SERIAL COMPUTING MODEL OF AGENT ENABLED MINING OF GLOBALLY STRONG ASSOCIATI...
A SERIAL COMPUTING MODEL OF AGENT ENABLED MINING OF GLOBALLY STRONG ASSOCIATI...A SERIAL COMPUTING MODEL OF AGENT ENABLED MINING OF GLOBALLY STRONG ASSOCIATI...
A SERIAL COMPUTING MODEL OF AGENT ENABLED MINING OF GLOBALLY STRONG ASSOCIATI...
 
SELF LEARNING REAL TIME EXPERT SYSTEM
SELF LEARNING REAL TIME EXPERT SYSTEMSELF LEARNING REAL TIME EXPERT SYSTEM
SELF LEARNING REAL TIME EXPERT SYSTEM
 
IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...
IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...
IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...
 
DMDW Lesson 05 + 06 + 07 - Data Mining Applied
DMDW Lesson 05 + 06 + 07 - Data Mining AppliedDMDW Lesson 05 + 06 + 07 - Data Mining Applied
DMDW Lesson 05 + 06 + 07 - Data Mining Applied
 
IRJET- Top-K Query Processing using Top Order Preserving Encryption (TOPE)
IRJET- Top-K Query Processing using Top Order Preserving Encryption (TOPE)IRJET- Top-K Query Processing using Top Order Preserving Encryption (TOPE)
IRJET- Top-K Query Processing using Top Order Preserving Encryption (TOPE)
 
DEVELOPING A NOVEL MULTIDIMENSIONAL MULTIGRANULARITY DATA MINING APPROACH FOR...
DEVELOPING A NOVEL MULTIDIMENSIONAL MULTIGRANULARITY DATA MINING APPROACH FOR...DEVELOPING A NOVEL MULTIDIMENSIONAL MULTIGRANULARITY DATA MINING APPROACH FOR...
DEVELOPING A NOVEL MULTIDIMENSIONAL MULTIGRANULARITY DATA MINING APPROACH FOR...
 
Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modeling
 
Machine Learning as service
Machine Learning as serviceMachine Learning as service
Machine Learning as service
 
EXECUTION OF ASSOCIATION RULE MINING WITH DATA GRIDS IN WEKA 3.8
EXECUTION OF ASSOCIATION RULE MINING WITH DATA GRIDS IN WEKA 3.8EXECUTION OF ASSOCIATION RULE MINING WITH DATA GRIDS IN WEKA 3.8
EXECUTION OF ASSOCIATION RULE MINING WITH DATA GRIDS IN WEKA 3.8
 
Lec1
Lec1Lec1
Lec1
 
Lec1
Lec1Lec1
Lec1
 
Scalable frequent itemset mining using heterogeneous computing par apriori a...
Scalable frequent itemset mining using heterogeneous computing  par apriori a...Scalable frequent itemset mining using heterogeneous computing  par apriori a...
Scalable frequent itemset mining using heterogeneous computing par apriori a...
 
A Performance Based Transposition algorithm for Frequent Itemsets Generation
A Performance Based Transposition algorithm for Frequent Itemsets GenerationA Performance Based Transposition algorithm for Frequent Itemsets Generation
A Performance Based Transposition algorithm for Frequent Itemsets Generation
 
Summarizing Software API Usage Examples Using Clustering Techniques
Summarizing Software API Usage Examples Using Clustering TechniquesSummarizing Software API Usage Examples Using Clustering Techniques
Summarizing Software API Usage Examples Using Clustering Techniques
 
BI Chapter 04.pdf business business business business
BI Chapter 04.pdf business business business businessBI Chapter 04.pdf business business business business
BI Chapter 04.pdf business business business business
 

Último

Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...nirzagarg
 
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...HyderabadDolls
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...vershagrag
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...kumargunjan9515
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numberssuginr1
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRajesh Mondal
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubaikojalkojal131
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxronsairoathenadugay
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 
Oral Sex Call Girls Kashmiri Gate Delhi Just Call 👉👉 📞 8448380779 Top Class C...
Oral Sex Call Girls Kashmiri Gate Delhi Just Call 👉👉 📞 8448380779 Top Class C...Oral Sex Call Girls Kashmiri Gate Delhi Just Call 👉👉 📞 8448380779 Top Class C...
Oral Sex Call Girls Kashmiri Gate Delhi Just Call 👉👉 📞 8448380779 Top Class C...Delhi Call girls
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...gajnagarg
 
Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Klinik kandungan
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...SOFTTECHHUB
 
👉 Bhilai Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top Class Call Girl Ser...
👉 Bhilai Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top Class Call Girl Ser...👉 Bhilai Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top Class Call Girl Ser...
👉 Bhilai Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top Class Call Girl Ser...vershagrag
 

Último (20)

Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
Kalyani ? Call Girl in Kolkata | Service-oriented sexy call girls 8005736733 ...
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
💞 Safe And Secure Call Girls Agra Call Girls Service Just Call 🍑👄6378878445 🍑...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
Oral Sex Call Girls Kashmiri Gate Delhi Just Call 👉👉 📞 8448380779 Top Class C...
Oral Sex Call Girls Kashmiri Gate Delhi Just Call 👉👉 📞 8448380779 Top Class C...Oral Sex Call Girls Kashmiri Gate Delhi Just Call 👉👉 📞 8448380779 Top Class C...
Oral Sex Call Girls Kashmiri Gate Delhi Just Call 👉👉 📞 8448380779 Top Class C...
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Nandurbar [ 7014168258 ] Call Me For Genuine Models...
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
👉 Bhilai Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top Class Call Girl Ser...
👉 Bhilai Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top Class Call Girl Ser...👉 Bhilai Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top Class Call Girl Ser...
👉 Bhilai Call Girls Service Just Call 🍑👄6378878445 🍑👄 Top Class Call Girl Ser...
 

ifip2008albashiri.pdf

  • 1. Agent Based Frequent Set Meta Mining: Introducing EMADS Kamal Ali Albashiri, Frans Coenen, and Paul Leng Department of Computer Science, The University of Liverpool, UK IFIP 2008,Milano, Italy
  • 2. Outline  Data Mining Process  Multi-Agent Systems For Data Mining  EMADS Vision  Meta ARM Application  Experimentation and Results  Conclusion and Current Work
  • 3. Data Mining Process The DM process can be summarized as follows: 1. Decide what you want to learn.  Learn a classification model so as to predict some feature(s) of new example data cases (Data Classification).  Find patterns (associations) within data (Association Rule Mining or ARM).  Group data into clusters and then be able to add new example data cases to the appropriate cluster according to the defining features of the identified clusters and the new case (Data Clustering). 2. Select and prepare your data.  Select relevant data for the desired objective. For instance training and test sets.  Consolidate the data into a single data warehouse to which mining algorithms can be applied.  (Usually) some form of data transform (normalisation, descretisation, etc), possibly data cleaning and so on.
  • 4. Data Mining Process cont’ 3. Choose and configure the mining task or tasks. For instance, a user may wish to cluster users together that visited similar Web pages, and then derive association rules that show how those users and pages are related. 4. Select and configure the mining algorithms. Many data-mining algorithms are available for a given task. Algorithms differ not only in the potential accuracy of their end-product, but also in the computational resources they require. 5. Build the data-mining model. The output from executing a data-mining task. The model can be viewed as an abstraction of the data suited with respect to the objective. The model might be a neural network, a decision tree, or even a set of rules understandable by humans. 6. Test and refine the models. In some cases several models may be created, evaluated and an enhanced (in some sense) result distilled. 7. Report findings or predict future outcomes. Finally, report the findings and/or use the generated data-mining models (for example to predict future example cases).
  • 5. MultiAgent Systems For Data Mining  Process automation. The current trend is towards automating as much of the data mining process as possible. "Even those not expert in data mining can reap the benefits of data-mining technologies," .  Task matching. Agents can automatically match algorithms to a desired data-mining objective; for instance, a clustering algorithm to create data clusters, or an association-rules algorithm to identify association rules.  Data and Task matching. Agents can automatically match mining tasks to relevant data.  Result evaluation. Agents can evaluate the accuracy of each model with respect to existing (training) data, and select a "best" (fit for purpose) model.  Privacy of sources. Individual owners of agents can preserve the privacy and security of their raw data and only share the results of data mining activities.  Extendibility. Flexibility to incorporate new data mining techniques and data sources (adding/removing agents)  Computational efficiency. Enhance the efficiency of parallel data mining processing.  Scalability and Resublity. MAS will offer access to a wider pool of available data and allow reuse to previously generated information.
  • 6. EMADS can (or will be able to) provide the following:  Enable and accelerate the deployment of practical solutions to data mining problems.  Provide a data mining agent space into which data mining agents of all sorts can be launched.  Allow anybody who is prepared to share their data to make that data available using an appropriately defined data agent.  Allow anybody who wishes to obtain data mining results the facility to launch an appropriately define query agent.  Allow anybody the facility to add a new data mining agent that can apply some data mining algorithm to data without significant specialised knowledge of multi-agent based data mining. EMADS (Extendible Multi-Agent Data mining System) Vision
  • 7. EMADS Meta ARM Application To illustrate some of the features of EMADS a Meta ARM scenario is considered in this paper. ARM Problem Definition  Given a database D we wish to find all the frequent itemsets (F) and then use this knowledge to produce high confidence association rules (ARs) of the form A→B.  Note: Finding F is the most computationally expensive part, once we have the frequent sets generating ARs is straight forward TID Atts. 1 2 3 4 5 1 2 1 2 3 2 3 4 2 3 4 3 4 5 6 4 5 1 2 2 4 5 2 3 2 4 3 4 2 3 3 Support threshold = 40% (count of 2.4) 2 2 4 Apriori Example (using a T-tree)
  • 8. Frequent Set Meta Mining  Meta ARM: A given ARM algorithm is applied to N raw data sets producing N collections of frequent item sets.  The objective is then to merge the different sets of results into a single meta set of frequent itemsets with the aim of generating a set of Association Rules (ARs).  Key issue: wherever an itemset is frequent in a data source A but not in a data source B a check for any contribution from data source B is required (to obtain a total support count). Meta ARM approach Data Mining Algorithm Data Mining Algorithm Data Mining Algorithm T-Tree T-Tree T-Tree Meta ARM Algorithm Merged T-Tree Legend Data Flow RTD 1 2 N
  • 9. Meta ARM algorithms  Issue in meta ARM is how best to combine the results from N different sources, in the most computationally efficient manner, into a single meta set of itemsets?  Meta ARM algorithms: We can identify a number of different strategies of combining the individually obtained T-Trees of N datasets into one global T- Tree making use of Return To Data lists (RTD) to obtain additional counts. 1. Brute Force: Merge T-trees one by one generating (N) RTD lists, pruning the T-tree at end of the merge process. 2. Apriori: Merge all T-trees level by level generating (K*N) RTD lists, pruning the T-tree at each level (K = number of levels). 3. Hybrid 1: Merge top level in the Apriori manner and the rest in BF manner. 4. Hybrid 2: Merge top two levels in the Apriori manner and the rest in BF manner. 5. A “bench mark” system is described to allow for appropriate comparison.
  • 10. Meta ARM algorithms Example  Brute Force Meta ARM   merge  inclusion of additional counts (RTD)  final prune. 2 4 2 3 2 2 2 3 2 4 2 4 1 2 1 2 3 2 1 2 1 2 2 4 2 1 6 2 BF RTD Lists {1} {4} DS2 DS1 Apriori RTD Lists (level 1) Dataset 2 (3 X 4) T-Tree T-Tree Merged T-Tree null {2,4} DS2 DS1 RTD Lists (level 2) Example: relative support =50% (3 records) Generated rules 2 → 4 (50%) 4 → 2 (100%) 1 2 4 2 2 2 • Apriori Meta ARM 1. K = 1 2. Generate candidate K-itemsets 3. Add supports for level K from N T- trees OR add to RTD list 4. Get additional support 5. Prune K-itemsets 6. K = K+1 7. Generate K-itemsets 8. End Loop relative support =50% (1.5 records count) Dataset 1 (3 X 4) 3 3 3 2 {1,2} {2,4} {1} {4} DS2 DS1
  • 11. Experimentation and Results 1. The number of data sources (data agents running on different machines ) - Each dataset has 100,000 transactions. - Support Threshold = 1%  Measured:  processing time,  the size of the RTD lists (messages’ size)  and the number of messages. 0 0.5 1 1.5 2 3 4 5 6 7 8 9 10 Number of Data Sets Time (sec) Apriori Bench Mark Hybrid & BF 0 50 100 150 200 2 3 4 5 6 7 8 9 10 Number of Data Sets Message Size (Kbyte) Apriori Bench Mark Hybrid & BF 0 10 20 30 40 50 2 3 4 5 6 7 8 9 10 Number of Data Sets Number Generated RTD Lists Apriori Brute Force Hybrid 1 Hybrid 2 Bench Mark
  • 12. Conclusion and Current Work  Conclusion  Multi-Agent Systems (MAS) can offer a well suited architecture for data mining in a decentralised, and anarchic manner so that much greater benefit can be obtained from current data mining capabilities and available data.  We described a extension of ARM where we build a meta set of frequent itemsets from a collection of component sets which have been generated in an autonomous manner without centralised control in a the EMADS environment.  Overall the Hybrid 2 algorithm is the best.  Current Work  Classification application  Data normalization (data pre-processing) agents  Wrapper agents: agents that can be used to incorporate new mining algorithms into EMADS without significant specialised knowledge of the system.