"Agro-Market Prediction by Fuzzy based Neuro-Genetic Algorithm"
1. FINAL SEMESTER PROJECT
Guide: Prof. Lavanya K.
Submitted by:
Tanay Chaudhari (09BCE449)
in association with Mritunjay Kumar
B.TECH – CSE (2009-13)
VIT, VELLORE
2. Analyzing data sets of statistics dealing with
the agriculture sector
Collecting cost/capital logs for cost
monitoring purposes
Applying hybrid algorithms to generate mean
cost values of the commodities
Finding the best score among the mean cost
values to arrive at a distinct prediction value
3. “Data Mining: Concepts and Techniques” by J.
Han, M. Kamber (2006)
Introduces data mining and statistical machine
learning approaches for modern
market data analysis
“Computer Systems that Learn: Classification
and Prediction Methods from Statistics, Neural
Nets, Machine Learning, and Expert Systems” by
S.M. Weiss, C.A. Kulikowski (1991)
Summarizes the concepts of major neural
network systems, their principles and
commonly used theorems
4. “An Incremental Approach to Genetic-Algorithms-
Based Classification” by S.U. Guan, F. Zhu; IEEE
Transactions on Systems, Man and Cybernetics - Part
B 35(2), 227–239 (2005)
Defines and describes GA-based algorithms and their
use in analytical principles. Clustering concepts are
discussed in detail.
“Research on the data warehouse and data mining
techniques applying to decision assistant system” by
Lifeng Hou, Tao Li, Jingjun Shen; IEEE Transactions on
Data Mining and Warehousing (2011)
Applies data warehouse techniques to build the
structure of a decision assistant system that handles the
dynamic, high-volume information in current decision-making.
5. 1) Genetic Algorithm
A search heuristic that mimics the
process of natural evolution to solve
optimization and search problems
Method followed is inspired by techniques
of natural evolution, viz. – inheritance,
mutation, selection, crossover
Applications – bioinformatics,
phylogenetics, computational science,
engineering, economics, etc.
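The selection, crossover and mutation steps listed above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the function name evolve, the truncation selection scheme, the TARGET vector and all parameter values are assumptions made for the example.

```python
import random

def evolve(fitness, n_genes, pop_size=30, generations=60,
           mutation_rate=0.1, seed=0):
    """Maximize `fitness` over real-valued chromosomes with genes in [0, 1]."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half as parents (truncation selection).
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Crossover (inheritance): prefix from one parent, rest from the other.
            cut = rng.randint(1, n_genes - 1) if n_genes > 1 else 0
            child = mother[:cut] + father[cut:]
            # Mutation: occasionally perturb a gene, clamped to [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        # Parents survive unchanged, so the best solution is never lost.
        pop = parents + children
    return max(pop, key=fitness)

TARGET = [0.2, 0.8, 0.5]  # hypothetical optimum, for illustration only

def closeness(chromosome):
    # Higher is better: negative squared distance to the target.
    return -sum((g - t) ** 2 for g, t in zip(chromosome, TARGET))

best = evolve(closeness, n_genes=3)
```

Because the parents are carried over unchanged, the best fitness cannot decrease from one generation to the next.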
6. 2) Fuzzy Logic Algorithm
A form of many-valued logic or
probabilistic logic dealing with reasoning
that is approximate rather than fixed and
exact
Allows approximate values and inferences
as well as incomplete or ambiguous data,
instead of solely relying on absolute data
Applications – smart computing, seismology,
etc.
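As an illustration of approximate rather than exact values, a price can belong partly to several fuzzy sets at once. The sketch below assumes triangular membership functions and hypothetical price bands; neither is taken from the project.

```python
def triangular(x, a, b, c):
    """Degree (0..1) to which x belongs to a triangular fuzzy set.

    a and c are the feet of the triangle, b is its peak (a < b < c).
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical price bands (currency units per kg), for illustration only.
PRICE_SETS = {
    "low": (0, 10, 20),
    "medium": (10, 20, 30),
    "high": (20, 30, 40),
}

def fuzzify(price):
    """Map a crisp price to its membership degree in each fuzzy set."""
    return {label: triangular(price, *abc) for label, abc in PRICE_SETS.items()}

memberships = fuzzify(15)  # half "low", half "medium", not "high"
```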
7. 3) Neuro (Neural) Algorithm
It combines the concept of artificial neural
networks (ANNs) and fuzzy logic
Results in ‘hybrid intelligent systems’,
involving the combination of human-like
reasoning with the learning structure of
neural networks
Applications - robotics, data processing,
function approximation, etc.
8. Backpropagation, a.k.a. ‘backward propagation of errors’
A common method of training ANNs, used in addition
to the 3 main AI techniques
Concept – given a desired output, the network
learns from many inputs
The standard network is – an input layer,
one or more hidden layers and an output layer
Each network weight is updated using the errors
calculated for each layer: the algorithm
propagates the squared-error gradient backwards and
adjusts the weights accordingly, until the
termination condition is satisfied
Helps in overcoming the drawbacks of classical
GA
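The weight update described above can be sketched as follows. This is a minimal example under stated assumptions: a single hidden layer, sigmoid activations, a fixed epoch budget as the termination condition, and toy data invented for illustration. The class name TinyNet and all parameter values are hypothetical and not taken from the tool listed later.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyNet:
    """One hidden layer, sigmoid activations, trained on squared error."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight row carries a trailing bias weight.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
                  for row in self.w_hidden]
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        hidden, out = self.forward(x)
        # Output delta: derivative of 0.5 * (out - target)^2 through the sigmoid.
        delta_out = (out - target) * out * (1.0 - out)
        # Hidden deltas: output delta propagated back through the output weights.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j, v in enumerate(hidden + [1.0]):
            self.w_out[j] -= lr * delta_out * v
        for j, row in enumerate(self.w_hidden):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * delta_hidden[j] * v

def total_error(net, data):
    return sum(0.5 * (net.forward(x)[1] - t) ** 2 for x, t in data)

# Toy data: two normalized inputs -> normalized target (illustrative only).
DATA = [([0.0, 0.0], 0.1), ([0.0, 1.0], 0.5),
        ([1.0, 0.0], 0.5), ([1.0, 1.0], 0.9)]
net = TinyNet(n_in=2, n_hidden=3)
before = total_error(net, DATA)
for _ in range(2000):  # termination condition: fixed epoch budget
    for x, t in DATA:
        net.train_step(x, t)
after = total_error(net, DATA)
```

Comparing before and after shows how much the squared error has fallen over training.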
9. Some of the tools used so far in the implementation of the
proposed system:
Weka Tool: Fuzzy based development tool
GA Fuzzy Clustering Tool: GA based Fuzzy logic
application developer
Weka Clusterer Visualize: Clustering developer of
the stats input
Microsoft’s ClusPrep: Clustering validation tool
Backpropagation Neuronal Network 0.3: For the
neural training set made for prediction
NOTE: Statistics are derived from the Agritech Portal
provided by the Tamil Nadu Agricultural Portal, for
the years 2007-2011
10. Data set assembled from several years of analytical
data logs, to be used as the input statistics
Clustering of data by available data statistics
Applying the Fuzzy Based GA algorithm
Implementing the training sets obtained by
the clustering of data
Re-clustering to improve on clustering of
data, thus improving on the available data
clusters
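The cluster and re-cluster loop can be illustrated with plain k-means on one-dimensional cost values. This is a swapped-in, simplified technique for illustration only: the project itself uses the Weka and GA fuzzy clustering tools, and the function name kmeans_1d and the sample prices are assumptions.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster numbers into k groups; returns (cluster means, groups)."""
    # Deterministic start: spread the initial centers across the sorted values.
    ordered = sorted(values)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        # Clustering step: assign every value to its nearest center.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Re-clustering step: move each center to the mean of its group.
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups

# Hypothetical commodity prices, for illustration only.
prices = [10, 11, 12, 30, 31, 32]
means, clusters = kmeans_1d(prices, k=2)
```

Each returned center is the mean cost of one cluster, which is the kind of value the later stages score and refine.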
11. Evaluation of clustering results is also known as
clustering validation
Need –
• To avoid finding patterns in noise
• To compare clustering algorithms
• To compare two sets of clusters
• To compare two clusters
12. Two major types of validation:
Internal Validation
• When a clustering result is evaluated based
on the data that was clustered itself
• To assign the “best score” to the algorithm
that produces clusters with high similarity
within a cluster and low similarity between
clusters
• Drawback of internal criteria – high scores on
an internal measure do not necessarily
result in effective information retrieval
applications
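One widely used internal measure is the silhouette coefficient, which rewards high similarity within a cluster and low similarity between clusters. The sketch below is illustrative; the source does not say which internal measure the project's tools compute, and the sample points are made up.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient in [-1, 1]; needs at least two clusters."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # Distances from p to every other point, grouped by that point's cluster.
        by_cluster = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                by_cluster.setdefault(other, []).append(math.dist(p, q))
        if lab not in by_cluster:  # singleton cluster: score defined as 0
            scores.append(0.0)
            continue
        a = sum(by_cluster[lab]) / len(by_cluster[lab])  # mean intra-cluster distance
        b = min(sum(d) / len(d)                          # mean distance to nearest other cluster
                for c, d in by_cluster.items() if c != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Two well-separated groups of made-up points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
good = silhouette(points, [0, 0, 0, 1, 1, 1])  # matches the visible groups
bad = silhouette(points, [0, 1, 0, 1, 0, 1])   # mixes the groups
```

The labelling that matches the visible groups scores close to 1, while the mixed labelling scores much lower.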
13. External Validation
• Clustering results are evaluated based on data
that was not used for clustering
• Based on – class labels, external benchmarks, etc.
which are created by human experts
• Considered as “gold standard” for evaluation
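When expert class labels are available, one simple external measure is the Rand index: the fraction of point pairs on which the clustering and the labels agree. This is a sketch for illustration; the source does not name the external measure used, and the labels below are hypothetical.

```python
from itertools import combinations

def rand_index(predicted, truth):
    """Fraction of point pairs grouped consistently by both labelings."""
    agree = 0
    pairs = 0
    for i, j in combinations(range(len(truth)), 2):
        same_pred = predicted[i] == predicted[j]
        same_true = truth[i] == truth[j]
        # A pair agrees if both labelings put it together, or both split it.
        agree += same_pred == same_true
        pairs += 1
    return agree / pairs

# Hypothetical expert labels versus two cluster assignments.
expert = ["grain", "grain", "pulse", "pulse"]
perfect = rand_index([0, 0, 1, 1], expert)
poor = rand_index([0, 1, 0, 1], expert)
```

Cluster identifiers do not need to match the label names, since only the pairwise grouping is compared.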
14. Determining the clustering tendency of a set of
data, i.e. - distinguishing whether non-random
structure actually exists in the data.
Comparing the results of a cluster analysis to
externally known results, e.g. - to externally
given class labels.
Evaluating how well the results of a cluster
analysis fit the data without reference to
external information.
Comparing the results of two different sets of
cluster analyses to determine which is better.
Determining the ‘correct’ number of clusters.
15. After the cluster validation stage, another
training set is constructed
This set is fed to the Backpropagation algorithm
function, to reduce the errors and thus
improve the mean value
The best score of the mean value is
considered as the “final prediction value” of
the set
16. The final prediction value is the calculated
value of the price of a commodity
Based on the initial data collected, this value
is quite accurate
May vary with changes
in the attributes
May vary with the amount of data
fed in initially
17. The prediction value may change each
time a new attribute value is entered
The error-minimized value can be highly
accurate but not absolutely accurate
It is up to the administration to implement the
moderated prediction price
The room for enhancement is large, depending on
the grade and quality of the tools employed
to generate the prediction value
The outcome of the system could be
enhanced by using only attribute-suitable
data and refined tabulation