International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 2, March – April (2013), © IAEME
459
PARAMETRIC COMPARISON BASED ON SPLIT CRITERION ON
CLASSIFICATION ALGORITHM IN STREAM DATA MINING
Ms. Madhu S. Shukla*, Dr.K.H.Wandra**, Mr. Kirit R. Rathod***
*(PG-CE Student, Department of Computer Engineering),
(C.U.Shah College of Engineering and Technology, Gujarat, India)
** (Principal, Department of Computer Engineering),
(C.U.Shah College of Engineering and Technology, Gujarat, India)
*** (Assistant Professor, Department of Computer Engineering)
ABSTRACT
Stream data mining is an emerging research topic. Today, a number of applications generate
massive amounts of stream data. Examples of such systems are sensor networks, real-time
surveillance systems, and telecommunication systems. Hence there is a need for intelligent
processing of such data to support its proper analysis and its use in other tasks. Mining
stream data is concerned with extracting knowledge structures, represented as models and
patterns, from non-stopping streams of information.
Classification in stream data mining is based on generating a decision tree, which makes
the decision process easy. Given the characteristics of stream data, it becomes essential to
handle large amounts of continuous and changing data accurately. In the classification
process, attribute selection at a non-leaf decision node thus becomes a critical analytic
point. Performance parameters such as speed of classification, accuracy, and CPU utilization
time can be improved if the split criterion is implemented precisely. This paper presents the
implementation of different attribute selection criteria and their comparison with an
alternative method.
Keywords: Stream, Stream Data Mining, Performance Parameter processing, MOA (Massive
Online Analysis), Split Criterion.
INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING
& TECHNOLOGY (IJCET)
ISSN 0976 – 6367(Print)
ISSN 0976 – 6375(Online)
Volume 4, Issue 2, March – April (2013), pp. 459-470
© IAEME: www.iaeme.com/ijcet.asp
Journal Impact Factor (2013): 6.1302 (Calculated by GISI)
www.jifactor.com
1. INTRODUCTION
The characteristics of stream data are also its challenges. Because of its huge size,
continuous nature, and the speed with which it changes, stream data requires a real-time
response, which is produced after analysing the data. Because the data is huge, an algorithm
that accesses it is restricted to a single scan of the data.
Data mining uses different types of algorithms for mining tasks such as classification,
clustering, and pattern recognition. In the same way, stream data mining uses different types
of algorithms for its mining tasks. Some of the algorithms for classification of stream data
are the Hoeffding Tree, VFDT (Very Fast Decision Tree), and CVFDT (Concept-adapting Very Fast
Decision Tree). These classification algorithms are based on the Hoeffding bound for decision
tree generation. They use the Hoeffding bound to gather an optimum amount of data so that
classification can be done accurately. CVFDT is able to detect concept drift, which is a
further challenge in stream data mining. As stream data is extremely large, a method is
required for improving the split criterion at the nodes of the decision tree, so that tree
generation is faster, accuracy is improved, and CPU utilization time is reduced. Two split
criteria are examined for stream data classification in this paper, and the algorithm is
improved on that basis as part of the research work.
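To illustrate how the Hoeffding bound limits the amount of data needed before a split is made, a minimal Python sketch follows. It uses the standard form of the bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)). The function names, the value delta = 1e-7, and the example gains are assumptions chosen for illustration; they are not taken from the experiments in this paper.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of the mean of n samples."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 n_classes: int, delta: float, n: int) -> bool:
    """Split once the observed gap between the two best attributes exceeds epsilon.

    For information gain the range R is taken as log2(number of classes).
    """
    value_range = math.log2(n_classes)
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Illustrative values only (assumed, not measured): epsilon shrinks as n grows.
for n in (100, 1_000, 10_000):
    print(n, round(hoeffding_bound(1.0, 1e-7, n), 4))

# The same gap between two attributes justifies a split only after enough examples.
print(should_split(0.38, 0.30, n_classes=2, delta=1e-7, n=1_000))
print(should_split(0.38, 0.30, n_classes=2, delta=1e-7, n=5_000))
```

With these assumed values the gap of 0.08 is not yet significant after 1,000 examples but is after 5,000, which is the sense in which the bound determines how much data a node must gather.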
As noted earlier, stream data is huge, so to perform an analysis we need to take a sample of
the data so that it can be processed with ease. The sample should be such that whatever data
falls within it is worth analysing or processing, which means that maximum knowledge is
extracted from the sampled data.
In this paper, the sampling technique used is an adaptive sliding window in a
Hoeffding-bound-based tree algorithm.
2. RELATED WORK
Implementing an algorithm for stream data classification demands better resource utilization
as well as improved accuracy while classification is ongoing. Here, we look at an improvement
to an algorithm that performs concept drift detection while classifying the data. Drift
detection here is done using a windowing technique.
Sliding Window: This is an advanced technique. It performs detailed analysis over the most
recent data items and over summarized versions of older ones.
The inspiration behind the sliding window is that the user is more concerned with the analysis
of the most recent data. This idea has been adopted in many techniques in comprehensive data
stream mining systems under development.
3. CLASSIFICATION PROCESS
Many data mining algorithms exist in practice. They can be categorized into three types:
1. Classification
2. Clustering
3. Association
A standard classification system normally has three phases:
1. The training phase, during which the model is built using labeled data.
2. The testing phase, during which the model is tested by measuring its classification
accuracy on withheld labeled data.
3. The deployment phase, during which the model is used to predict the class of unlabelled
data.
The three phases are carried out in sequence. See Figure 3.1 for the standard classification
phases.
Fig 3.1: Phases of standard classification systems
3.1. STREAM DATA MINING
Ordinary classification is usually considered in three phases. In the first phase, a model is
built using data, called the training data, for which the property of interest (the class) is
already known (labeled data). In the second phase, the model is used to predict the class of
data (test data) for which the property of interest is known but which the model has not
previously seen. In the third phase, the model is deployed and used to predict the property of
interest for unlabelled data.
In stream classification, there is only a single stream of data, in which labeled and
unlabelled records occur together. The training/test and deployment phases therefore
interleave. Stream classification of unlabelled records could be required from the beginning
of the stream, after some sufficiently long initial sequence of labeled records, at specific
moments in time, or for a specific block of records selected by an external analyst.
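The interleaving of phases described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `MajorityClassLearner` is a hypothetical stand-in for a real stream classifier such as a Hoeffding tree, unlabelled records are marked with `None`, and each labelled record is first used for testing and then for training.

```python
from collections import Counter


class MajorityClassLearner:
    """Toy incremental learner (hypothetical): predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        if not self.counts:
            return None  # nothing learned yet
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1


def process_stream(stream, learner):
    """Interleave testing, training and deployment over a single stream.

    Each record is (x, y); y is None for an unlabelled record.
    Returns (accuracy on labelled records, predictions for unlabelled records).
    """
    correct = tested = 0
    predictions = []
    for x, y in stream:
        if y is None:
            predictions.append(learner.predict(x))  # deployment
        else:
            if learner.predict(x) == y:             # test on the record first ...
                correct += 1
            tested += 1
            learner.learn(x, y)                     # ... then train on it
    accuracy = correct / tested if tested else 0.0
    return accuracy, predictions


# Made-up stream mixing labelled and unlabelled records.
stream = [(1, "a"), (2, "a"), (3, None), (4, "b"), (5, "a"), (6, None)]
print(process_stream(stream, MajorityClassLearner()))
```

No record is stored or revisited, which matches the single-scan restriction on stream algorithms.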
4. ATTRIBUTE SELECTION CRITERION IN DECISION TREE:
Selecting an appropriate splitting criterion helps improve the performance measurement
dimensions. In data stream mining there are three main performance measurement dimensions:
- Accuracy
- Amount of space (computer memory) necessary (model cost or RAM-hours)
- The time required to learn from training examples and to predict (evaluation time)
These properties may be interdependent: adjusting the time and space used by an
algorithm can influence accuracy. By storing more pre-computed information, such as look
up tables, an algorithm can run faster at the expense of space. An algorithm can also run
faster by processing less information, either by stopping early or storing less, thus having less
data to process. The more time an algorithm has, the more likely it is that accuracy can be
increased.
There are two major attribute selection criteria: Information Gain and Gini Index. The latter
is also known as a binary split criterion. During the late 1970s and 1980s, J. Ross Quinlan, a
researcher in machine learning, developed a decision tree algorithm known as ID3 [1]
(Iterative Dichotomiser). ID3 uses information gain for attribute selection. The information
gain of an attribute A is given as $Gain(A) = Info(D) - Info_A(D)$. We have developed a new
algorithm to calculate information gain. Methodologically, this algorithm is promising. We
have divided the algorithm into two parts. The first part calculates $Info(D)$ and the second
part calculates $Gain(A)$.
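The two-part computation can be sketched in Python as shown here. This is a minimal batch illustration, not the authors' implementation: the function names `info` and `gain`, the data layout (a list of attribute tuples plus a parallel list of labels), and the toy data are all assumptions.

```python
import math
from collections import Counter, defaultdict


def info(labels):
    """Part 1: Info(D), the entropy in bits of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain(rows, labels, attribute):
    """Part 2: Gain(A) = Info(D) - Info_A(D) for the attribute at the given index."""
    partitions = defaultdict(list)
    for row, label in zip(rows, labels):
        partitions[row[attribute]].append(label)
    total = len(labels)
    # Info_A(D): entropy of each partition, weighted by the partition's share of D.
    info_a = sum(len(part) / total * info(part) for part in partitions.values())
    return info(labels) - info_a


# Toy data (assumed): 30 instances, one binary attribute.
# Attribute value 0 -> 13 of class "A", 4 of class "B"; value 1 -> 1 "A", 12 "B".
rows = [(0,)] * 17 + [(1,)] * 13
labels = ["A"] * 13 + ["B"] * 4 + ["A"] * 1 + ["B"] * 12

print(info(labels))
print(gain(rows, labels, 0))
```

For this toy data the gain comes out at roughly 0.38. A streaming learner would keep the class counts per attribute value incrementally instead of recomputing them from stored rows.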
4.1. Information Gain Calculation
Information gain = (information before split) – (information after split)
Entropy: A common way to measure impurity is entropy.
• Entropy = $-\sum_i p_i \log_2 p_i$, where $p_i$ is the probability of class $i$.
Compute it as the proportion of class $i$ in the set.
• Entropy comes from information theory. The higher the entropy, the more the information
content.
• For continuous data, the candidate split value is computed as $(a_i + a_{i+1})/2$.
Calculating Information Gain
Information Gain = entropy(parent) – [weighted average entropy(children)]
Parent entropy, entire population (30 instances):
\[ -\left(\frac{14}{30}\log_2\frac{14}{30}\right) - \left(\frac{16}{30}\log_2\frac{16}{30}\right) = 0.996 \]
Child entropy (17 instances):
\[ -\left(\frac{13}{17}\log_2\frac{13}{17}\right) - \left(\frac{4}{17}\log_2\frac{4}{17}\right) = 0.787 \]
Child entropy (13 instances):
\[ -\left(\frac{1}{13}\log_2\frac{1}{13}\right) - \left(\frac{12}{13}\log_2\frac{12}{13}\right) = 0.391 \]
(Weighted) Average Entropy of Children:
\[ \left(\frac{17}{30}\cdot 0.787\right) + \left(\frac{13}{30}\cdot 0.391\right) = 0.615 \]
Information Gain = 0.996 - 0.615 = 0.38
Figure 4.1: Calculating information gain
4.2. Calculating Gini Index
If a data set T contains examples from n classes, the Gini index, gini(T), is defined as
\[ gini(T) = 1 - \sum_{j=1}^{n} p_j^2 \]
where $p_j$ is the relative frequency of class j in T. gini(T) is minimized if the classes in
T are skewed.
After splitting T into two subsets T1 and T2 with sizes N1 and N2, where N = N1 + N2, the Gini
index of the split data is defined as
\[ gini_{split}(T) = \frac{N_1}{N}\,gini(T_1) + \frac{N_2}{N}\,gini(T_2) \]
The attribute providing the smallest $gini_{split}(T)$ is chosen to split the node.
5. METHODOLOGY AND PROPOSED ALGORITHM
CVFDT (Concept-adapting Very Fast Decision Tree) is an extended version of VFDT that provides
the same speed and accuracy advantages and adds the ability to detect and respond to changes
in the example-generating process. Like other systems with this capability, CVFDT uses a
sliding window of examples to keep its model consistent. Most such systems need to learn a new
model from scratch after new data arrives. Instead, CVFDT continuously monitors the quality of
its earlier decisions against the new data and adjusts those that are no longer correct.
Whenever a new example arrives, CVFDT increments the counts for the new example and decrements
the counts for the oldest example in the window. If the concept is stationary, this has no
statistical effect. If the concept is changing, however, some splits will no longer appear
best, because another attribute now provides more gain on the new data. Whenever this occurs,
CVFDT begins to grow an alternative subtree with the new best attribute at its root. When the
alternative subtree becomes more accurate on new data than the old one, it replaces the old
subtree.
5.1 CVFDT ALGORITHM (Based on Hoeffding Tree)
1. Alternate trees for each node in HT start as empty.
2. Process examples from the stream indefinitely.
3. For each example (x, y):
4. Pass (x, y) down to a set of leaves using HT and all alternate trees of the nodes (x, y)
passes through.
5. Add (x, y) to the sliding window of examples.
6. Remove and forget the effect of the oldest example if the sliding window overflows.
7. CVFDT Grow.
8. Check split validity if f examples have been seen since the last checking of alternate trees.
9. Return HT.
Fig: 5.1 Flow of CVFDT algorithm
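Steps 5 and 6 of the algorithm, adding a new example and forgetting the oldest one, can be sketched in Python as shown here. This is a deliberate simplification under stated assumptions: the class name `SlidingWindowCounts` is hypothetical, and only class counts are tracked, whereas CVFDT maintains such counts per tree node, attribute value, and class.

```python
from collections import Counter, deque


class SlidingWindowCounts:
    """Keeps class counts for the most recent `size` examples only (simplified sketch)."""

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.counts = Counter()

    def add(self, x, y):
        # Step 5: add the new example and increment its count.
        self.window.append((x, y))
        self.counts[y] += 1
        # Step 6: if the window overflows, forget the effect of the oldest example.
        if len(self.window) > self.size:
            _, old_y = self.window.popleft()
            self.counts[old_y] -= 1
            if self.counts[old_y] == 0:
                del self.counts[old_y]


# Made-up stream whose class distribution changes from "a" to "b".
window = SlidingWindowCounts(size=3)
for i, label in enumerate("aabbb"):
    window.add(i, label)
    print(i, dict(window.counts))
```

Once the window has slid past the first two examples, the counts reflect only the newer class, which is how statistics computed over the window follow a changing concept.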
6. EXPERIMENTAL ANALYSIS WITH OBSERVATION
Different datasets were taken, and the CVFDT algorithm was run after importing those datasets
into MOA. The performance of different split criteria used in the decision tree approach was
also tested with a view to improving the accuracy of the algorithm. The datasets used here are
in ARFF format. Some of the data are taken from the University of California repository, and
some from projects in Spain that work on stream data.
The datasets taken were as follows:
1) Sensor
2) SEA
3) Random Tree Generator
The readings taken here are for the Sensor data. It contains information (temperature,
humidity, light, and sensor voltage) collected from 54 sensors deployed in the Intel Berkeley
Research Lab. The whole stream contains consecutive information recorded over a 2-month
period (1 reading every 1-3 minutes). The sensor ID is used as the class label, so the
learning task of the stream is to correctly identify the sensor ID (1 out of 54 sensors)
purely from the sensor data and the corresponding recording time. As the data stream flows
over time, so do the concepts underlying the stream. For example, the lighting during working
hours is generally stronger than at night, and the temperature of specific sensors (conference
room) may regularly rise during meetings.
Fig: 6.1 MIT Computer Science and Artificial Intelligence Lab data repository
As discussed above, an attribute selection measure is a heuristic for selecting the splitting
criterion that “best” separates the given data. Two common measures are:
1) Entropy based method (i.e. Information Gain)
2) Gini Index
6.1 RANDOM TREE GENERATOR DATA SET RESULTS
Instance    Information Gain accuracy (%)    Gini Index accuracy (%)
100000      92.6                             81.7
200000      93                               83
300000      94.7                             80.1
400000      96.3                             82.2
500000      94.8                             80.9
600000      96.9                             81.9
700000      96.9                             82.6
800000      96.7                             82.1
900000      98.7                             84
1000000     97.4                             77.9
Table-I: Comparison for accuracy in random tree generator
6.2 SEA DATA SET RESULTS
Instance    Information Gain accuracy (%)    Gini Index accuracy (%)
100000      89.8                             89.3
200000      92.1                             91.6
300000      89.6                             89.3
400000      89.1                             88.9
500000      88.5                             88.5
600000      88.8                             88.1
700000      90.6                             90.6
800000      89.5                             89.3
900000      89.1                             89
1000000     89.9                             89.9
Table-II: Comparison for accuracy for SEA Data
6.3 PERFORMANCE ANALYSIS BASED ON SENSOR DATA SET (CPU UTILIZATION)
Learning evaluation    Evaluation time (CPU seconds)    Evaluation time (CPU seconds)
instances              Info Gain                        Gini Index
100000                 6.676843                         8.704856
200000                 13.46289                         18.67332
300000                 20.23333                         29.40619
400000                 26.97257                         39.87386
500000                 33.68062                         49.63952
600000                 40.40426                         59.06198
700000                 47.0499                          67.70443
800000                 53.74234                         78.0941
900000                 59.93558                         88.14057
1000000                66.79963                         98.48343
1100000                73.27367                         107.1727
1200000                79.27971                         116.9851
1300000                85.53535                         127.016
1400000                91.99379                         136.6257
1500000                98.40543                         145.2993
1600000                104.3803                         152.9278
1700000                110.3083                         160.0102
1800000                116.4859                         168.1223
1900000                121.9928                         174.8459
Table-III: Comparison of CPU Utilization time for SENSOR Data
6.4 PERFORMANCE ANALYSIS BASED ON SENSOR DATA SET (ACCURACY)
Learning evaluation    Classifications correct (percent)    Classifications correct (percent)
instances              Info Gain                            Gini Index
100000                 96.3                                 98.4
200000                 68.3                                 69.7
300000                 18                                   64.4
400000                 43.2                                 67.4
500000                 62.8                                 72.9
600000                 92                                   71
700000                 97.9                                 72.5
800000                 97.4                                 73.9
900000                 96.8                                 73.7
1000000                80.6                                 68.5
1100000                53.6                                 71.2
1200000                71                                   90.3
1300000                84.1                                 73.1
1400000                78.5                                 83.9
1500000                96.3                                 84.9
1600000                50.9                                 84.9
1700000                24                                   79
1800000                74.3                                 87.6
1900000                98                                   97.8
Table-IV: Comparison of ACCURACY for SENSOR Data
6.5 PERFORMANCE ANALYSIS BASED ON SENSOR DATA SET (TREE SIZE)
Learning evaluation    Tree size (nodes)    Tree size (nodes)
instances              Info Gain            Gini Index
100000                 14                   126
200000                 30                   270
300000                 44                   396
400000                 60                   530
500000                 76                   666
600000                 88                   800
700000                 102                  938
800000                 122                  1076
900000                 136                  1214
1000000                150                  1346
1100000                172                  1466
1200000                196                  1602
1300000                216                  1742
1400000                226                  1868
1500000                240                  1998
1600000                262                  2122
1700000                282                  2238
1800000                292                  2352
1900000                312                  2474
Table-V: Comparison of TREE SIZE for SENSOR Data
6.6 PERFORMANCE ANALYSIS BASED ON SENSOR DATA SET (LEAVES)
Learning evaluation    Tree size (leaves)    Tree size (leaves)
instances              Info Gain             Gini Index
100000                 7                     63
200000                 15                    135
300000                 22                    198
400000                 30                    265
500000                 38                    333
600000                 44                    400
700000                 51                    469
800000                 61                    538
900000                 68                    607
1000000                75                    673
1100000                86                    733
1200000                98                    801
1300000                108                   871
1400000                113                   934
1500000                120                   999
1600000                131                   1061
1700000                141                   1119
1800000                146                   1176
1900000                156                   1237
Table-VI: Comparison of LEAVES for SENSOR Data
6.7 COMPARISON OF ALL DIMENSIONS OF PERFORMANCE TOGETHER FOR SENSOR DATA
Fig 6.2: Comparison of Performance for Sensor Data for every dimension together
7. CONCLUSION
In this paper, we discussed theoretical aspects and practical results of stream data mining
classification algorithms with different split criteria. The comparison across different
datasets shows the results. With Information Gain as the split criterion, Hoeffding trees with
the windowing technique spend less time learning than with Gini Index (Table-III) and reach
higher accuracy on the Random Tree Generator data (Table-I); on the SEA data the two criteria
give nearly the same accuracy (Table-II). Memory utilization, accuracy, and CPU utilization,
which are crucial factors for stream data, are discussed here with practical observations.
Classification generates a decision tree, and the tree generated with Information Gain as the
split criterion is also smaller, as shown in Table-V, along with a dramatic change in accuracy
and CPU utilization.
REFERENCES
[1] Elena Ikonomovska, Suzana Loskovska, Dejan Gjorgjevik, “A Survey of Stream Data
Mining”, Eighth National Conference with International Participation, ETAI 2007.
[2] S. Muthukrishnan, “Data Streams: Algorithms and Applications”, Proceedings of the
Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 2003.
[3] Mohamed Medhat Gaber, Arkady Zaslavsky and Shonali Krishnaswamy, “Mining Data
Streams: A Review”, Centre for Distributed Systems and Software Engineering, Monash
University, 900 Dandenong Rd, Caulfield East, VIC 3145, Australia.
[4] P. Domingos and G. Hulten, “A General Method for Scaling Up Machine Learning
Algorithms and its Application to Clustering”, Proceedings of the Eighteenth International
Conference on Machine Learning, 2001, Williamstown, MA, Morgan Kaufmann.
[5] H. Kargupta, R. Bhargava, K. Liu, M. Powers, P. Blair, S. Bushra, J. Dull, K. Sarkar, M.
Klein, M. Vasa, and D. Handy, “VEDAS: A Mobile and Distributed Data Stream Mining
System for Real-Time Vehicle Monitoring”, Proceedings of SIAM International Conference
on Data Mining, 2004.
[6] Albert Bifet and Ricard Gavaldà, “Adaptive Parameter-free Learning from Evolving Data
Streams”, Universitat Politècnica de Catalunya, Barcelona, Spain.
[7] Dariusz Brzezinski, “Mining Stream with Concept Drift”, Master’s thesis, Poznan
University of Technology.
[8] R. Manickam, D. Boominath and V. Bhuvaneswari, “An Analysis of Data Mining: Past,
Present and Future”, International Journal of Computer Engineering & Technology (IJCET),
Volume 3, Issue 1, 2012, pp. 1 - 9, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
[9] M. Karthikeyan, M. Suriya Kumar and S. Karthikeyan, “A Literature Review
on the Data Mining And Information Security”, International Journal of Computer
Engineering & Technology (IJCET), Volume 3, Issue 1, 2012, pp. 141 - 146, ISSN Print:
0976 – 6367, ISSN Online: 0976 – 6375.

A study and survey on various progressive duplicate detection mechanismseSAT Journals
 
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASETSURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASETEditor IJMTER
 
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUES
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUESGI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUES
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUESAM Publications
 

Semelhante a Parametric comparison based on split criterion on classification algorithm (20)

Aa31163168
Aa31163168Aa31163168
Aa31163168
 
Evaluation of a New Incremental Classification Tree Algorithm for Mining High...
Evaluation of a New Incremental Classification Tree Algorithm for Mining High...Evaluation of a New Incremental Classification Tree Algorithm for Mining High...
Evaluation of a New Incremental Classification Tree Algorithm for Mining High...
 
Development of pattern knowledge discovery framework using
Development of pattern knowledge discovery framework usingDevelopment of pattern knowledge discovery framework using
Development of pattern knowledge discovery framework using
 
A STUDY OF TRADITIONAL DATA ANALYSIS AND SENSOR DATA ANALYTICS
A STUDY OF TRADITIONAL DATA ANALYSIS AND SENSOR DATA ANALYTICSA STUDY OF TRADITIONAL DATA ANALYSIS AND SENSOR DATA ANALYTICS
A STUDY OF TRADITIONAL DATA ANALYSIS AND SENSOR DATA ANALYTICS
 
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEYCLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
 
Anomalous symmetry succession for seek out
Anomalous symmetry succession for seek outAnomalous symmetry succession for seek out
Anomalous symmetry succession for seek out
 
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCEAPPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
 
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCEAPPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
 
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCEAPPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
 
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCEAPPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
 
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCEAPPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
 
Distributed Digital Artifacts on the Semantic Web
Distributed Digital Artifacts on the Semantic WebDistributed Digital Artifacts on the Semantic Web
Distributed Digital Artifacts on the Semantic Web
 
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
APPLICATION OF DYNAMIC CLUSTERING ALGORITHM IN MEDICAL SURVEILLANCE
 
Fault detection of imbalanced data using incremental clustering
Fault detection of imbalanced data using incremental clusteringFault detection of imbalanced data using incremental clustering
Fault detection of imbalanced data using incremental clustering
 
50120130406041 2
50120130406041 250120130406041 2
50120130406041 2
 
50120140505015 2
50120140505015 250120140505015 2
50120140505015 2
 
[IJET-V1I3P11] Authors : Hemangi Bhalekar, Swati Kumbhar, Hiral Mewada, Prati...
[IJET-V1I3P11] Authors : Hemangi Bhalekar, Swati Kumbhar, Hiral Mewada, Prati...[IJET-V1I3P11] Authors : Hemangi Bhalekar, Swati Kumbhar, Hiral Mewada, Prati...
[IJET-V1I3P11] Authors : Hemangi Bhalekar, Swati Kumbhar, Hiral Mewada, Prati...
 
A study and survey on various progressive duplicate detection mechanisms
A study and survey on various progressive duplicate detection mechanismsA study and survey on various progressive duplicate detection mechanisms
A study and survey on various progressive duplicate detection mechanisms
 
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASETSURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
 
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUES
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUESGI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUES
GI-ANFIS APPROACH FOR ENVISAGE HEART ATTACK DISEASE USING DATA MINING TECHNIQUES
 

Mais de IAEME Publication

IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME Publication
 
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...IAEME Publication
 
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSA STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSIAEME Publication
 
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSBROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSIAEME Publication
 
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSDETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSIAEME Publication
 
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSIAEME Publication
 
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOVOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOIAEME Publication
 
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IAEME Publication
 
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYVISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYIAEME Publication
 
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...IAEME Publication
 
GANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEGANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEIAEME Publication
 
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...IAEME Publication
 
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...IAEME Publication
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...IAEME Publication
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...IAEME Publication
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...IAEME Publication
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...IAEME Publication
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...IAEME Publication
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...IAEME Publication
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTA MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTIAEME Publication
 

Mais de IAEME Publication (20)

IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdf
 
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
 
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSA STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
 
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSBROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
 
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSDETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
 
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
 
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOVOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
 
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
 
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYVISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
 
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
 
GANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEGANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICE
 
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
 
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTA MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
 

Último

Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 

Último (20)

Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 

Parametric comparison based on split criterion on classification algorithm

INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET), ISSN 0976 – 6367 (Print), ISSN 0976 – 6375 (Online), Volume 4, Issue 2, March – April (2013), pp. 459-470. © IAEME: www.iaeme.com/ijcet.asp. Journal Impact Factor (2013): 6.1302 (Calculated by GISI), www.jifactor.com
1. INTRODUCTION

The characteristics of stream data are also its challenges. Because of its huge size, its continuous nature, and the speed with which it changes, stream data demands a real-time response, and that response can only be given after the data has been analyzed. Because the data is so large, any algorithm that accesses it is restricted to a single scan.

Data mining uses different algorithms for different mining tasks such as Classification, Clustering, and Pattern Recognition. In the same way, Stream Data Mining uses different algorithms for different mining tasks. Algorithms for classification of stream data include the Hoeffding Tree, VFDT (Very Fast Decision Tree), and CVFDT (Concept-adapting Very Fast Decision Tree). These classification algorithms are based on the Hoeffding Bound for decision tree generation. They use the Hoeffding Bound to gather just enough data at a node for classification to be done accurately. CVFDT is additionally able to detect concept drift, which is a further challenge in stream data mining.

As the size of stream data is extremely large, a method is required for improving the split criterion at the nodes of the decision tree, so that tree generation is faster, accuracy is improved, and CPU utilization time is reduced. Two different split criteria are evaluated for stream data classification in this paper, and the algorithm is improved on that basis as part of this research work.

As noted above, stream data is huge in size, so in order to analyze it we need to take a sample of the data so that the stream can be processed with ease.
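The role of the Hoeffding Bound mentioned in the introduction can be illustrated with a short sketch. It uses the standard textbook form of the bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)), where R is the range of the split measure, delta is the allowed error probability, and n is the number of examples seen at a node. The function names and the example values are illustrative assumptions and are not taken from the paper; for information gain over c classes, R is commonly taken to be log2(c).

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    value_range (R), delta and n are the usual symbols of the bound;
    the function name itself is hypothetical.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of examples")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(best: float, second_best: float,
              value_range: float, delta: float, n: int) -> bool:
    """Split only when the best attribute beats the runner-up by more than epsilon."""
    return (best - second_best) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Assumed example: two classes (R = log2(2) = 1), delta = 1e-7, 1000 examples.
    eps = hoeffding_bound(1.0, 1e-7, 1000)
    print(f"epsilon after 1000 examples: {eps:.4f}")
    print("split on a 0.20 gap:", can_split(0.30, 0.10, 1.0, 1e-7, 1000))
    print("split on a 0.05 gap:", can_split(0.30, 0.25, 1.0, 1e-7, 1000))
```

Because epsilon shrinks as n grows, a node that cannot yet separate its two best attributes simply waits for more examples, which is how a single pass over the stream can still yield a confident split.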
These samples taken should be such that whatever data comes in the portion of sample is worth analyzing or processing, which means maximum knowledge is extracted from that sampled data. In this paper sampling technique used is adaptive sliding window in Hoeffding-Bound based tree algorithm. 2. RELATED WORK Implementing algorithm for Stream Data Classification demands improvement in resource utilization as well as improvisation in accuracy with ongoing classification process. Here, we would see improvement done on algorithm that is based on Concept Drift Detection while doing the classification of the data. Drift Detection here is done using Windowing Technique. Sliding Window: It is an advance technique. It deals with detailed analysis over most recent data items and over summarized versions of older ones. The inspiration behind sliding window is that the user is more concerned with the analysis of most recent data streams. Thus the detailed analysis is done over the most recent data items and summarized versions of the old ones. This idea has been adopted in many techniques in the undergoing comprehensive data stream mining system. 3. CLASSIFICATION PROCESS. There are many data mining algorithms that exist in practice. Data mining algorithms can be categorized in three types: 1. Classification 2. Clustering 3. Association
A standard classification system normally has three phases:
1. The training phase, during which the model is built using labeled data.
2. The testing phase, during which the model is tested by measuring its classification accuracy on withheld labeled data.
3. The deployment phase, during which the model is used to predict the class of unlabeled data.
The three phases are carried out in sequence. See Figure 3.1 for the standard classification phases.

Fig 3.1: Phases of standard classification systems

3.1. STREAM DATA MINING

Ordinary classification is usually considered in three phases. In the first phase, a model is built using data, called the training data, for which the property of interest (the class) is already known (labeled data). In the second phase, the model is used to predict the class of data (test data) for which the property of interest is known, but which the model has not previously seen. In the third phase, the model is deployed and used to predict the property of interest for unlabeled data.

In stream classification, there is only a single stream of data, with labeled and unlabeled records occurring together in the stream. The training/test and deployment phases therefore interleave. Stream classification of unlabeled records could be required from the beginning of the stream, after some sufficiently long initial sequence of labeled records, at specific moments in time, or for a specific block of records selected by an external analyst.

4. ATTRIBUTE SELECTION CRITERION IN DECISION TREE

Selecting an appropriate splitting criterion helps to improve the performance measurement dimensions.
Data stream mining has three main performance measurement dimensions:
- Accuracy
- The amount of space (computer memory) required (model cost, or RAM-hours)
- The time required to learn from training examples and to predict (evaluation time)
These properties may be interdependent: adjusting the time and space used by an algorithm can influence accuracy. By storing more pre-computed information, such as lookup tables, an algorithm can run faster at the expense of space. An algorithm can also run faster by processing less information, either by stopping early or by storing less and thus having less data to process. The more time an algorithm has, the more likely it is that accuracy can be increased.
There are two major attribute selection criteria: Information Gain and Gini Index. The latter is also known as a binary split criterion. During the late 1970s and 1980s, J. Ross Quinlan, a researcher in machine learning, developed a decision tree algorithm known as ID3 [1] (Iterative Dichotomiser). ID3 uses information gain for attribute selection. The information gain of an attribute A on a data set D is given as

$$\mathrm{Gain}(A) = \mathrm{Info}(D) - \mathrm{Info}_A(D)$$

We have developed a new algorithm to calculate information gain. Methodologically, this algorithm is promising. We have divided the algorithm into two parts: the first part calculates Info(D) and the second part calculates Gain(A).

4.1. Information Gain

Calculation: (information before split) - (information after split)

Entropy: A common way to measure impurity is entropy.
• Entropy is defined as

$$\mathrm{Entropy} = -\sum_i p_i \log_2 p_i$$

where $p_i$ is the probability of class i, computed as the proportion of class i in the set.
• Entropy comes from information theory. The higher the entropy, the more the information content.
• For continuous data, the candidate split value is computed as the midpoint of two adjacent values, $(a_i + a_{i+1})/2$.

Calculating Information Gain

Information Gain = entropy(parent) - [weighted average entropy(children)]

Parent entropy (entire population, 30 instances):

$$-\frac{14}{30}\log_2\frac{14}{30} - \frac{16}{30}\log_2\frac{16}{30} = 0.996$$

Child entropy (17 instances):

$$-\frac{13}{17}\log_2\frac{13}{17} - \frac{4}{17}\log_2\frac{4}{17} = 0.787$$

Child entropy (13 instances):

$$-\frac{1}{13}\log_2\frac{1}{13} - \frac{12}{13}\log_2\frac{12}{13} = 0.391$$

(Weighted) average entropy of children:

$$\frac{17}{30}\cdot 0.787 + \frac{13}{30}\cdot 0.391 = 0.615$$

Information Gain = 0.996 - 0.615 = 0.38

Figure 4.1: Calculating information gain

4.2. Calculating Gini Index

If a data set T contains examples from n classes, the Gini index, gini(T), is defined as

$$\mathrm{gini}(T) = 1 - \sum_{j=1}^{n} p_j^2$$

where $p_j$ is the relative frequency of class j in T. gini(T) is minimized if the classes in T are skewed; it is 0 when all examples in T belong to a single class. After splitting T into two subsets $T_1$ and $T_2$ with sizes $N_1$ and $N_2$ (with $N = N_1 + N_2$), the Gini index of the split data is defined as

$$\mathrm{gini}_{split}(T) = \frac{N_1}{N}\,\mathrm{gini}(T_1) + \frac{N_2}{N}\,\mathrm{gini}(T_2)$$

The attribute providing the smallest $\mathrm{gini}_{split}(T)$ is chosen to split the node.
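The two criteria can be checked numerically. The following is a minimal sketch, not the implementation evaluated in this paper: it computes entropy, information gain, and the Gini index of a split from per-class counts, using the 30-instance example of Section 4.1 (a parent with class counts 14 and 16, split into children with counts 13/4 and 1/12). The function names are illustrative assumptions.

```python
from math import log2


def entropy(counts):
    """Entropy of a node given its per-class counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def gini(counts):
    """Gini index of a node given its per-class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def weighted_impurity(children, impurity):
    """Average impurity of the child nodes, weighted by child size."""
    n = sum(sum(child) for child in children)
    return sum(sum(child) / n * impurity(child) for child in children)


def information_gain(parent, children):
    """entropy(parent) minus the weighted average entropy of the children."""
    return entropy(parent) - weighted_impurity(children, entropy)


def gini_split(children):
    """Gini index of a split; the smallest value marks the preferred attribute."""
    return weighted_impurity(children, gini)


parent = [14, 16]               # entire population, 30 instances
children = [[13, 4], [1, 12]]   # 17 instances and 13 instances

print(f"information gain = {information_gain(parent, children):.2f}")
print(f"gini of split    = {gini_split(children):.3f}")
```

With these counts the information gain agrees with the value of 0.38 worked out in Section 4.1. Note that the two criteria are used in opposite directions: the attribute with the largest information gain, or the smallest Gini index of the split, is selected.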
5. METHODOLOGY AND PROPOSED ALGORITHM

CVFDT (Concept-adapting Very Fast Decision Tree) is an extended version of VFDT that provides the same speed and accuracy advantages and adds the ability to detect and respond to changes in the example-generating process. Like other systems with this ability, CVFDT uses a sliding window of examples to keep its model consistent. Most such systems need to learn a new model from scratch after new data arrives. Instead, CVFDT continuously monitors the quality of its earlier split decisions on the new data and adjusts those that are no longer correct. Whenever a new example arrives, CVFDT increments the counts for the new example and decrements the counts for the oldest example in the window. If the concept is stationary, this has no statistical effect. If the concept is changing, however, some splits will no longer appear best, because on the new data another attribute provides more gain than the previous one. Whenever this occurs, CVFDT starts to grow an alternative subtree with the new best attribute at its root. The alternative subtree replaces the old subtree when it is more accurate on new data.

5.1 CVFDT ALGORITHM (Based on Hoeffding Tree)

1. Alternate trees for each node in HT start as empty.
2. Process examples from the stream indefinitely.
3. For each example (x, y):
4. Pass (x, y) down to a set of leaves using HT and all alternate trees of the nodes (x, y) passes through.
5. Add (x, y) to the sliding window of examples.
6. Remove and forget the effect of the oldest example if the sliding window overflows.
7. CVFDT Grow.
8. Check split validity if f examples have been seen since the last checking of alternate trees.
9. Return HT.

Fig 5.1: Flow of CVFDT algorithm
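Steps 5 and 6 of the algorithm, together with the Hoeffding-bound test that tree growth relies on, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the MOA or CVFDT source code: `WindowedCounts`, `hoeffding_bound`, and `should_split` are hypothetical names, the examples are assumed to have nominal attributes stored in a dictionary, and the bound is the standard Hoeffding inequality, $\epsilon = \sqrt{R^2 \ln(1/\delta) / (2n)}$, which is not written out in this paper.

```python
from collections import Counter, deque
from math import log, sqrt


def hoeffding_bound(value_range, delta, n):
    """Standard Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n))."""
    return sqrt(value_range ** 2 * log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Split only when the best attribute beats the runner-up by more than epsilon."""
    return best_gain - second_gain > hoeffding_bound(value_range, delta, n)


class WindowedCounts:
    """Counts of (attribute, value, label) over the most recent `capacity` examples."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.window = deque()
        self.counts = Counter()

    def _update(self, x, y, step):
        for attribute, value in x.items():
            key = (attribute, value, y)
            self.counts[key] += step
            if self.counts[key] == 0:
                del self.counts[key]  # keep only counts still backed by the window

    def add(self, x, y):
        # Step 5: add the new example and increment its counts.
        self.window.append((x, y))
        self._update(x, y, +1)
        # Step 6: forget the oldest example once the window overflows.
        if len(self.window) > self.capacity:
            old_x, old_y = self.window.popleft()
            self._update(old_x, old_y, -1)


stats = WindowedCounts(capacity=3)
examples = [
    ({"light": "high"}, "day"),
    ({"light": "high"}, "day"),
    ({"light": "low"}, "night"),
    ({"light": "low"}, "night"),
]
for x, y in examples:
    stats.add(x, y)

# Only the last three examples still contribute to the counts.
print(dict(stats.counts))
print(should_split(best_gain=0.38, second_gain=0.10, value_range=1.0, delta=1e-7, n=1000))
```

Because the counts always reflect only the current window, split measures computed from them follow the current concept rather than the whole history of the stream.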
6. EXPERIMENTAL ANALYSIS WITH OBSERVATION

Different data sets were taken, and the CVFDT algorithm was run after importing those data sets into MOA. The split criteria used in the decision tree approach were also compared, with the aim of improving the accuracy of the algorithm. The data sets used here are in ARFF format. Some of the data are taken from the repository of the University of California, and some from projects in Spain that work on stream data. The data sets taken were as follows:
1) Sensor
2) Sea
3) Random Tree Generator

The readings described here are for the Sensor data. It contains information (temperature, humidity, light, and sensor voltage) collected from 54 sensors deployed in the Intel Berkeley Research Lab. The whole stream contains consecutive information recorded over a 2-month period (1 reading per 1-3 minutes). The sensor ID is used as the class label, so the learning task of the stream is to correctly identify the sensor ID (1 out of 54 sensors) purely from the sensor data and the corresponding recording time. As the data stream flows over time, so do the concepts underlying the stream. For example, the lighting during working hours is generally stronger than at night, and the temperature of specific sensors (conference room) may regularly rise during meetings.

Fig 6.1: MIT Computer Science and Artificial Intelligence Lab data repository

As discussed above, an attribute selection measure is a heuristic for selecting the splitting criterion that "best" separates the given data. Two common methods used for it are:
1) Entropy-based method (i.e. Information Gain)
2) Gini Index

6.1 RANDOM TREE GENERATOR DATA SET RESULTS
| Instances | Information Gain (accuracy, %) | Gini Index (accuracy, %) |
|---|---|---|
| 100000 | 92.6 | 81.7 |
| 200000 | 93 | 83 |
| 300000 | 94.7 | 80.1 |
| 400000 | 96.3 | 82.2 |
| 500000 | 94.8 | 80.9 |
| 600000 | 96.9 | 81.9 |
| 700000 | 96.9 | 82.6 |
| 800000 | 96.7 | 82.1 |
| 900000 | 98.7 | 84 |
| 1000000 | 97.4 | 77.9 |

Table-I: Comparison for accuracy in random tree generator

6.2 SEA DATA SET RESULTS

| Instances | Information Gain (accuracy, %) | Gini Index (accuracy, %) |
|---|---|---|
| 100000 | 89.8 | 89.3 |
| 200000 | 92.1 | 91.6 |
| 300000 | 89.6 | 89.3 |
| 400000 | 89.1 | 88.9 |
| 500000 | 88.5 | 88.5 |
| 600000 | 88.8 | 88.1 |
| 700000 | 90.6 | 90.6 |
| 800000 | 89.5 | 89.3 |
| 900000 | 89.1 | 89 |
| 1000000 | 89.9 | 89.9 |

Table-II: Comparison for accuracy for SEA Data
6.3 PERFORMANCE ANALYSIS BASED ON SENSOR DATA SET (CPU UTILIZATION)

| Learning evaluation instances | Evaluation time (CPU seconds), Info Gain | Evaluation time (CPU seconds), Gini Index |
|---|---|---|
| 100000 | 6.676843 | 8.704856 |
| 200000 | 13.46289 | 18.67332 |
| 300000 | 20.23333 | 29.40619 |
| 400000 | 26.97257 | 39.87386 |
| 500000 | 33.68062 | 49.63952 |
| 600000 | 40.40426 | 59.06198 |
| 700000 | 47.0499 | 67.70443 |
| 800000 | 53.74234 | 78.0941 |
| 900000 | 59.93558 | 88.14057 |
| 1000000 | 66.79963 | 98.48343 |
| 1100000 | 73.27367 | 107.1727 |
| 1200000 | 79.27971 | 116.9851 |
| 1300000 | 85.53535 | 127.016 |
| 1400000 | 91.99379 | 136.6257 |
| 1500000 | 98.40543 | 145.2993 |
| 1600000 | 104.3803 | 152.9278 |
| 1700000 | 110.3083 | 160.0102 |
| 1800000 | 116.4859 | 168.1223 |
| 1900000 | 121.9928 | 174.8459 |

Table-III: Comparison of CPU Utilization time for SENSOR Data
6.4 PERFORMANCE ANALYSIS BASED ON SENSOR DATA SET (ACCURACY)

| Learning evaluation instances | Classifications correct (percent), Info Gain | Classifications correct (percent), Gini Index |
|---|---|---|
| 100000 | 96.3 | 98.4 |
| 200000 | 68.3 | 69.7 |
| 300000 | 18 | 64.4 |
| 400000 | 43.2 | 67.4 |
| 500000 | 62.8 | 72.9 |
| 600000 | 92 | 71 |
| 700000 | 97.9 | 72.5 |
| 800000 | 97.4 | 73.9 |
| 900000 | 96.8 | 73.7 |
| 1000000 | 80.6 | 68.5 |
| 1100000 | 53.6 | 71.2 |
| 1200000 | 71 | 90.3 |
| 1300000 | 84.1 | 73.1 |
| 1400000 | 78.5 | 83.9 |
| 1500000 | 96.3 | 84.9 |
| 1600000 | 50.9 | 84.9 |
| 1700000 | 24 | 79 |
| 1800000 | 74.3 | 87.6 |
| 1900000 | 98 | 97.8 |

Table-IV: Comparison of ACCURACY for SENSOR Data
6.5 PERFORMANCE ANALYSIS BASED ON SENSOR DATA SET (TREE SIZE)

| Learning evaluation instances | Tree size (nodes), Info Gain | Tree size (nodes), Gini Index |
|---|---|---|
| 100000 | 14 | 126 |
| 200000 | 30 | 270 |
| 300000 | 44 | 396 |
| 400000 | 60 | 530 |
| 500000 | 76 | 666 |
| 600000 | 88 | 800 |
| 700000 | 102 | 938 |
| 800000 | 122 | 1076 |
| 900000 | 136 | 1214 |
| 1000000 | 150 | 1346 |
| 1100000 | 172 | 1466 |
| 1200000 | 196 | 1602 |
| 1300000 | 216 | 1742 |
| 1400000 | 226 | 1868 |
| 1500000 | 240 | 1998 |
| 1600000 | 262 | 2122 |
| 1700000 | 282 | 2238 |
| 1800000 | 292 | 2352 |
| 1900000 | 312 | 2474 |

Table-V: Comparison of TREE SIZE for SENSOR Data
6.6 PERFORMANCE ANALYSIS BASED ON SENSOR DATA SET (LEAVES)

| Learning evaluation instances | Tree size (leaves), Info Gain | Tree size (leaves), Gini Index |
|---|---|---|
| 100000 | 7 | 63 |
| 200000 | 15 | 135 |
| 300000 | 22 | 198 |
| 400000 | 30 | 265 |
| 500000 | 38 | 333 |
| 600000 | 44 | 400 |
| 700000 | 51 | 469 |
| 800000 | 61 | 538 |
| 900000 | 68 | 607 |
| 1000000 | 75 | 673 |
| 1100000 | 86 | 733 |
| 1200000 | 98 | 801 |
| 1300000 | 108 | 871 |
| 1400000 | 113 | 934 |
| 1500000 | 120 | 999 |
| 1600000 | 131 | 1061 |
| 1700000 | 141 | 1119 |
| 1800000 | 146 | 1176 |
| 1900000 | 156 | 1237 |

Table-VI: Comparison of LEAVES for SENSOR Data

6.7 COMPARISON OF ALL DIMENSIONS OF PERFORMANCE TOGETHER FOR SENSOR DATA

Fig 6.2: Comparison of Performance for Sensor Data for every dimension together
7. CONCLUSION

In this paper, we discussed theoretical aspects and practical results of stream data mining classification algorithms with different split criteria. The comparison is based on results from several data sets. Hoeffding trees with the windowing technique spend the least time on learning and reach higher accuracy when Information Gain is the split criterion than when Gini Index is used. Memory utilization, accuracy, and CPU utilization, which are crucial factors in stream data, are discussed in this paper on the basis of these observations. Classification generates a decision tree, and the tree generated with Information Gain as the split criterion is also smaller, as shown in Tables V and VI, along with a marked change in accuracy and CPU utilization (Tables III and IV).

REFERENCES

[1] Elena Ikonomovska, Suzana Loskovska, Dejan Gjorgjevik, "A Survey of Stream Data Mining", Eighth National Conference with International Participation, ETAI 2007.
[2] S. Muthukrishnan, "Data Streams: Algorithms and Applications", Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 2003.
[3] Mohamed Medhat Gaber, Arkady Zaslavsky and Shonali Krishnaswamy, "Mining Data Streams: A Review", Centre for Distributed Systems and Software Engineering, Monash University, 900 Dandenong Rd, Caulfield East, VIC 3145, Australia.
[4] P. Domingos and G. Hulten, "A General Method for Scaling Up Machine Learning Algorithms and its Application to Clustering", Proceedings of the Eighteenth International Conference on Machine Learning, 2001, Williamstown, MA, Morgan Kaufmann.
[5] H. Kargupta, R. Bhargava, K. Liu, M. Powers, P. Blair, S. Bushra, J. Dull, K. Sarkar, M. Klein, M. Vasa, and D. Handy, "VEDAS: A Mobile and Distributed Data Stream Mining System for Real-Time Vehicle Monitoring", Proceedings of SIAM International Conference on Data Mining, 2004.
[6] Albert Bifet and Ricard Gavaldà, "Adaptive Parameter-free Learning from Evolving Data Streams", Universitat Politècnica de Catalunya, Barcelona, Spain.
[7] Dariusz Brzezinski, "Mining Stream with Concept Drift", Master's thesis, Poznan University of Technology.
[8] R. Manickam, D. Boominath and V. Bhuvaneswari, "An Analysis of Data Mining: Past, Present and Future", International Journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 1, 2012, pp. 1-9, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
[9] M. Karthikeyan, M. Suriya Kumar and S. Karthikeyan, "A Literature Review on the Data Mining and Information Security", International Journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 1, 2012, pp. 141-146, ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.