Group – 9
Project I (CSE 791)
7th Semester, CSE.
UIT, Burdwan University.
Project Guide:
    Mr. Dipankar Dutta
    (Associate Professor of CSE & IT Department,
     UIT, Burdwan University)

Project Members:
    Puja Mukherjee (20081013)
    Sunandita Chattopadhyay (20081001)
    Rakesh Mukherjee (20081055)
    Bibaswann Bandyopadhyay (20081017)
    Ankita Ghosh (20081051)
Fields Covered:
Neural Network:
    • Self Organizing Maps
    • Learning Vector Quantization
Evolutionary Computation:
    • Particle Swarm Optimization
    • Ant Colony Optimization
    • Simulated Annealing
Clustering: One of the most important unsupervised learning
problems. It groups similar kinds of data together. Here we
cluster continuous data using Neural Network and
Evolutionary Computation techniques.




[Figure: continuous data grouped into clusters of data]
Self Organizing Map
1. Normalize the input data to fall between -1 and 1
         (using multiplicative normalization).
2. Feed the normalized input data to the input layer of the SOM.
3. Convert the output to a bipolar number.
4. The learn rate is initially 0.99.
5. The winning neuron is the one that produces the largest bipolar
         value.
6. A matrix called the correction matrix holds the corrections.
7. The error is calculated as follows.
         For all input neurons:
         a) The difference between the training set entry and the corresponding
            weight matrix entry is calculated.
         b) The difference is added to the correction matrix entry for that input neuron.
         c) The square of the difference is the error value.
8. The weights are adjusted using the subtractive method:
         a) For the winning neuron, each input weight is adjusted by the
            corresponding correction matrix value multiplied by the learn rate.
9. The learn rate is decreased.
10. The current error is compared with the best error so far; if it is lower, the best
         error value is updated.
11. Repeat steps 5 to 10 until the error value has decreased continuously over the last 50 iterations.
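The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the function names, the decay factor, and the fixed epoch count are hypothetical, the update for the winning neuron is assumed to be w + rate * (input - w), and the bipolar conversion is skipped because a monotonic mapping of the outputs does not change which neuron wins.

```python
import math
import random


def normalize(vec):
    """Multiplicative normalization: scale the vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def winner(weights, x):
    """Index of the output neuron with the largest output (dot product)."""
    outputs = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return max(range(len(outputs)), key=outputs.__getitem__)


def train_som(data, n_outputs, learn_rate=0.99, decay=0.98, epochs=100, seed=0):
    """Winner-take-all training; decay and epochs are assumed values."""
    rng = random.Random(seed)
    inputs = [normalize(x) for x in data]
    dim = len(inputs[0])
    weights = [normalize([rng.uniform(-1, 1) for _ in range(dim)])
               for _ in range(n_outputs)]
    best_error = float("inf")
    for _ in range(epochs):
        error = 0.0
        for x in inputs:
            k = winner(weights, x)
            # Correction: difference between the input and the winner's weights.
            diff = [xi - wi for xi, wi in zip(x, weights[k])]
            error += sum(d * d for d in diff)
            # Assumed update rule: move the winner toward the input.
            weights[k] = [wi + learn_rate * d for wi, d in zip(weights[k], diff)]
        learn_rate *= decay                    # step 9: decrease the learn rate
        best_error = min(best_error, error)    # step 10: track the best error
    return weights, best_error
```

After training, winner(weights, normalize(x)) gives the cluster index for a data vector x. The sketch stops after a fixed number of epochs instead of the 50-iteration rule in step 11, which keeps the example short.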
Learning Vector Quantization
   1. Load the input dataset.
   2. Set maximum number of cluster centers = M (number of class labels).
      Set minimum number of cluster centers = 0 (zero).
   3. FOR an input pattern P
       a. FIND the closest cluster center C to P.
           i. IF not found THEN
               Allocate P as a new cluster center C.
           ii. ELSE
               FIND the distance of the cluster center C from P.
               A. IF (distance > THRESHOLD) THEN
                  Allocate P as a new cluster center C.
               B. ELSE
                  1. Attach P to the cluster center C.
                  2. Recalculate the center of cluster C.
       b. REPEAT step (a) for all inputs.
   4. REPEAT step 3 for 100 iterations.
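A rough Python sketch of the procedure above is given here. It rests on several assumptions that the slides do not state: distance is Euclidean, a cluster center is the mean of the patterns attached to it during the current pass, and once M centers exist a distant pattern is attached to its nearest center instead of creating a new one. The names dist and threshold_cluster are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def threshold_cluster(points, threshold, max_centers, iterations=100):
    centers = []
    for _ in range(iterations):
        members = [[] for _ in centers]  # memberships are rebuilt on every pass
        for p in points:
            # Closest existing center, or None when there are no centers yet.
            j = min(range(len(centers)),
                    key=lambda i: dist(p, centers[i]), default=None)
            too_far = j is not None and dist(p, centers[j]) > threshold
            if j is None or (too_far and len(centers) < max_centers):
                centers.append(list(p))      # allocate P as a new center
                members.append([p])
            else:
                members[j].append(p)         # attach P to center C
                # Recalculate the center as the mean of its attached patterns.
                centers[j] = [sum(c) / len(members[j]) for c in zip(*members[j])]
    return centers
```

For example, threshold_cluster(points, threshold=3.0, max_centers=2) on two well-separated groups of 2-D points returns one center per group. The threshold has to be chosen for the scale of the data.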
Particle Swarm Optimization
1. Initialize each particle with k random cluster centroids.
2. For t = 1 to t_max do
    a. For each particle i do
        i. For each data vector z in the dataset
            - Calculate the Euclidean distance of
              z to all cluster centroids.
            - Assign z to the cluster whose
              centroid is nearest to z.
        ii. Calculate the fitness function.
    b. Update the global best and local best positions.
    c. Update the cluster centroids according to the velocity
       update and particle position update formulas of PSO.
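The loop above can be sketched in Python as follows. This is an illustration under assumptions, not the project's code: the fitness function is taken to be the mean distance from each data vector to its nearest centroid (which performs the nearest-centroid assignment implicitly), and the inertia weight w and acceleration constants c1, c2 are commonly used values that the slides do not specify. All function and parameter names are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fitness(centroids, data):
    """Assumed fitness: mean distance of each vector to its nearest centroid."""
    return sum(min(dist(z, c) for c in centroids) for z in data) / len(data)


def pso_cluster(data, k, n_particles=10, t_max=50, w=0.72, c1=1.49, c2=1.49, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    # Each particle holds k centroids, initialised from random data vectors.
    pos = [[list(rng.choice(data)) for _ in range(k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in p] for p in pos]
    pbest_fit = [fitness(p, data) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]
    for _ in range(t_max):
        for i in range(n_particles):
            f = fitness(pos[i], data)
            if f < pbest_fit[i]:                      # local best update
                pbest[i], pbest_fit[i] = [c[:] for c in pos[i]], f
            if f < gbest_fit:                         # global best update
                gbest, gbest_fit = [c[:] for c in pos[i]], f
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Standard PSO velocity update, then position update.
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    pos[i][j][d] += vel[i][j][d]
    return gbest, gbest_fit
```

The returned gbest is a list of k centroids; each data vector belongs to the cluster of its nearest centroid.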
Ant Colony Optimization
1. Place every item Xi on a random cell of the grid;
2. Place every ant k on a random cell of the grid unoccupied by ants;
3. iteration_count ← 1;
4. while iteration_count < maximum_iteration do
5. for k = 1 to no_of_ants do
6.     if unladen ant and cell occupied by item Xi then
7.        compute f(Xi) and Ppick-up(Xi);
8.        pick up item Xi with probability Ppick-up(Xi);
9.     else
10.      if ant carrying item Xi and cell empty then
11.        compute f(Xi) and Pdrop(Xi);
12.        drop item Xi with probability Pdrop(Xi);
13.      end if
14.    end if
15.    move to a randomly selected neighboring and unoccupied cell;
16. end for
17. iteration_count ← iteration_count + 1
18. end while
19. print locations of items
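A compact Python sketch of this ant-based clustering loop follows. The slides do not define f(Xi), Ppick-up, or Pdrop, so the forms used here are assumptions: f is the average similarity to items in the 8 surrounding cells, Ppick-up = (k1 / (k1 + f))^2, and Pdrop = (f / (k2 + f))^2. The toroidal grid, the constants k1, k2, alpha, and all names are hypothetical choices for illustration.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ant_cluster(items, grid=10, n_ants=5, max_iter=2000,
                k1=0.1, k2=0.15, alpha=1.0, seed=0):
    rng = random.Random(seed)
    cells = [(r, c) for r in range(grid) for c in range(grid)]
    # Steps 1-2: items and ants start on distinct random cells.
    item_at = dict(zip(rng.sample(cells, len(items)), range(len(items))))
    ants = [{"pos": p, "load": None} for p in rng.sample(cells, n_ants)]

    def neighbours(pos):
        r, c = pos
        return [((r + dr) % grid, (c + dc) % grid)
                for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

    def density(i, pos):
        # Assumed f(Xi): average similarity of item i to items in nearby cells.
        total = sum(1.0 - dist(items[i], items[item_at[n]]) / alpha
                    for n in neighbours(pos) if n in item_at)
        return max(0.0, total / 8.0)

    for _ in range(max_iter):
        for ant in ants:
            pos = ant["pos"]
            if ant["load"] is None and pos in item_at:
                f = density(item_at[pos], pos)
                if rng.random() < (k1 / (k1 + f)) ** 2:     # assumed Ppick-up
                    ant["load"] = item_at.pop(pos)
            elif ant["load"] is not None and pos not in item_at:
                f = density(ant["load"], pos)
                if rng.random() < (f / (k2 + f)) ** 2:      # assumed Pdrop
                    item_at[pos] = ant["load"]
                    ant["load"] = None
            # Move to a random neighbouring cell not occupied by an ant.
            occupied = {a["pos"] for a in ants}
            free = [n for n in neighbours(pos) if n not in occupied]
            if free:
                ant["pos"] = rng.choice(free)
    locations = {i: pos for pos, i in item_at.items()}
    carried = [a["load"] for a in ants if a["load"] is not None]
    return locations, carried
```

The function returns the grid cell of every item that lies on the grid, plus the indices of items still carried by ants when the loop ends. Clusters are then read off as groups of items in adjacent cells.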
Simulated Annealing
Coordinator node algorithm:
 1. Distribute the n random initial solutions to the n nodes and wait.
 2. Upon receiving the first converged result from any of the nodes,
     stop simulated annealing on the other nodes.

Worker node algorithm:
 1. Accept an initial solution from the coordinator.
 2. repeat
    2.1. Execute simulated annealing for p iterations. Exchange
         partial results among the worker nodes. Accept the best partial
         result.
    2.2. p = p - r * (loop iteration number).
    until (p <= 0).
 3. Execute simulated annealing using the best solution found as the
     initial solution.
 4. Send the converged value to the coordinator.
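The coordinator and worker scheme above can be imitated in a single process, as in the following Python sketch. It is only an illustration under assumptions: the workers run one after another instead of on separate nodes, the coordinator takes the best final result as a stand-in for "first converged", and the cooling schedule, step size, starting temperature, and all names are hypothetical.

```python
import math
import random


def anneal(cost, x, steps, temp, rng, cooling=0.95, step_size=0.5):
    """Run `steps` Metropolis moves from x; return the best (solution, cost)."""
    cur, cur_c = list(x), cost(x)
    best, best_c = cur, cur_c
    for _ in range(steps):
        cand = [v + rng.uniform(-step_size, step_size) for v in cur]
        cand_c = cost(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cand_c < cur_c or rng.random() < math.exp((cur_c - cand_c) / temp):
            cur, cur_c = cand, cand_c
            if cur_c < best_c:
                best, best_c = cur, cur_c
        temp = max(temp * cooling, 1e-12)  # floor avoids division by zero
    return best, best_c


def cooperative_sa(cost, initial_solutions, p=40, r=5, final_steps=200, seed=0):
    rng = random.Random(seed)
    states = [(list(x), cost(x)) for x in initial_solutions]
    loop = 0
    while p > 0:
        loop += 1
        # Step 2.1: each worker anneals for p iterations from its current state.
        results = [anneal(cost, x, p, 1.0, rng) for x, _ in states]
        best = min(results, key=lambda s: s[1])
        states = [best] * len(states)      # every worker accepts the best partial result
        p -= r * loop                      # step 2.2
    # Step 3: final annealing from the best solution found so far.
    finals = [anneal(cost, x, final_steps, 1.0, rng) for x, _ in states]
    return min(finals, key=lambda s: s[1])
```

For clustering, cost would score a candidate set of cluster centers, for instance by the total distance of the data to the nearest center. Any function that maps a list of floats to a number works with this sketch.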
Project report on Data Clustering
