Module 12: Machine Learning
Lesson 38: Neural Networks - II
12.4.3 Perceptron

Definition: A perceptron is a step function applied to a linear combination of real-valued inputs. If the combination is above a threshold it outputs 1; otherwise it outputs –1.

[Figure: a perceptron unit. Inputs x1, x2, …, xn enter with weights w1, w2, …, wn; a fixed input x0 = 1 carries the bias weight w0. The weighted sum Σ is passed through a threshold to produce an output of 1 or –1.]

O(x1, x2, …, xn) = 1   if w0 + w1x1 + w2x2 + … + wnxn > 0
                 = –1  otherwise

A perceptron draws a hyperplane as the decision boundary over the (n-dimensional) input
space.
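
The output rule above translates directly into code. The following is a minimal Python sketch (added here for illustration; the helper name, the list-based weights, and the separate bias argument are my own choices):

def perceptron_output(weights, bias, inputs):
    """Return 1 if w0 + w1*x1 + ... + wn*xn > 0, else -1 (bias plays the role of w0)."""
    s = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if s > 0 else -1

The hyperplane in the figure below is exactly the set of points where s crosses zero.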


[Figure: the decision boundary (WX = 0), with positive (+) examples on one side of the hyperplane and negative (–) examples on the other.]

A perceptron can learn only examples that are linearly separable, i.e., examples that can be perfectly separated by a hyperplane, as illustrated below.



[Figure: left, a linearly separable set, in which a single line separates the + examples from the – examples; right, a non-linearly separable set, in which no single line can do so.]

Perceptrons can learn many Boolean functions: AND, OR, NAND, and NOR, but not XOR.

However, every Boolean function can be represented by a perceptron network that is two or more levels deep.
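
As an illustration of the two-level claim (an example added here, not taken from the notes), XOR can be computed by feeding an OR unit and a NAND unit into an AND unit. The weights below are one workable choice, using the {–1, 1} encoding for inputs and outputs and the perceptron_output sketch above:

def xor_net(x1, x2):
    """Two-level perceptron network computing XOR for x1, x2 in {-1, 1}."""
    h_or = perceptron_output([0.5, 0.5], 0.5, [x1, x2])         # OR of the inputs
    h_nand = perceptron_output([-0.5, -0.5], 0.5, [x1, x2])     # NAND of the inputs
    return perceptron_output([0.5, 0.5], -0.5, [h_or, h_nand])  # AND of the two

# xor_net gives -1, 1, 1, -1 on (-1,-1), (-1,1), (1,-1), (1,1): the XOR truth table.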

The weights of a perceptron implementing the AND function are shown below.


AND: [Figure: a perceptron computing AND, with weights W1 = 0.5 and W2 = 0.5 on inputs x1 and x2, and bias weight W0 = –0.8 on the fixed input X0 = 1.]
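
A quick check (again a sketch, reusing the perceptron_output helper above) confirms that these weights reproduce the AND truth table on 0/1 inputs:

for x1 in (0, 1):
    for x2 in (0, 1):
        o = perceptron_output([0.5, 0.5], -0.8, [x1, x2])
        print(x1, x2, "->", o)  # prints 1 only when x1 = x2 = 1, otherwise -1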

12.4.3.1 Perceptron Learning

Learning a perceptron means finding the right values for W. The hypothesis space of a
perceptron is the space of all weight vectors. The perceptron learning algorithm can be
stated as below.

1. Assign random values to the weight vector.
2. Apply the weight update rule to every training example.
3. Are all training examples correctly classified?
      a. Yes: quit.
      b. No: go back to Step 2.


There are two popular weight update rules:
   i) the perceptron rule, and
   ii) the delta rule.

The Perceptron Rule

For a new training example X = (x1, x2, …, xn), update each weight according to this
rule:

wi = wi + Δwi

where Δwi = η (t – o) xi

t: target output
o: output generated by the perceptron
η: a constant called the learning rate (e.g., 0.1)

Comments about the perceptron training rule:

   •   If the example is correctly classified, the term (t – o) equals zero, and no update to the weights is necessary.
   •   If the perceptron outputs –1 and the target output is 1, the weights are increased.
   •   If the perceptron outputs 1 and the target output is –1, the weights are decreased.
   •   Provided the examples are linearly separable and a small value of η is used, the rule is guaranteed to classify all training examples correctly (i.e., it is consistent with the training data).
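
Combining the three-step algorithm with the perceptron rule gives a training loop like the sketch below. The function name, the initialization range, and the epoch cap are illustrative choices; targets are assumed to be in {–1, 1}, and perceptron_output is the helper sketched earlier:

import random

def train_perceptron(examples, eta=0.1, max_epochs=100):
    """examples: list of (inputs, target) pairs with target in {-1, 1}."""
    n = len(examples[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]  # step 1: random weights
    w0 = random.uniform(-0.5, 0.5)                     # bias weight on x0 = 1
    for _ in range(max_epochs):
        misclassified = 0
        for x, t in examples:                          # step 2: apply the update rule
            o = perceptron_output(w, w0, x)
            if o != t:
                misclassified += 1
                for i in range(n):
                    w[i] += eta * (t - o) * x[i]       # delta_wi = eta * (t - o) * xi
                w0 += eta * (t - o)                    # bias update, since x0 = 1
        if misclassified == 0:                         # step 3: all correct, so quit
            break
    return w, w0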

The Delta Rule

What happens if the examples are not linearly separable? To handle this situation, we approximate the target concept as closely as possible using the delta rule.

The key idea is to use a gradient descent search. We try to minimize the following error:

E = ½ Σd (td – od)²

where the sum ranges over all training examples d. Here od is the inner product W·Xd, not sgn(W·Xd) as with the perceptron rule. The idea is to find a minimum of the error function E in the space of weight vectors.

The delta rule is as follows:

For a new training example X = (x1, x2, …, xn), update each weight according to this
rule:

wi = wi + Δwi

where Δwi = –η ∂E/∂wi

η: learning rate (e.g., 0.1)

Differentiating E with respect to wi gives

∂E/∂wi = Σd (td – od)(–xid)

where xid is the i-th component of training example d. Substituting this into the update rule gives

Δwi = η Σd (td – od) xid
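
In code, the batch form of the delta rule accumulates the gradient over all training examples before each weight update. The sketch below mirrors the equations above, using the linear output W·X in place of sgn(W·X); the names and the fixed epoch count are illustrative:

def train_delta_rule(examples, eta=0.01, epochs=1000):
    """Gradient descent on E = 1/2 * sum_d (t_d - o_d)^2, with o_d = W . X_d."""
    n = len(examples[0][0])
    w = [0.0] * n
    w0 = 0.0
    for _ in range(epochs):
        grad = [0.0] * n                                   # sum_d (t_d - o_d) * x_id
        grad0 = 0.0
        for x, t in examples:
            o = w0 + sum(wi * xi for wi, xi in zip(w, x))  # linear output, not sgn
            for i in range(n):
                grad[i] += (t - o) * x[i]
            grad0 += t - o
        for i in range(n):
            w[i] += eta * grad[i]          # delta_wi = eta * sum_d (t_d - o_d) * x_id
        w0 += eta * grad0
    return w, w0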

There are two differences between the perceptron rule and the delta rule. First, the perceptron rule updates the weights based on the output of the step function, whereas the delta rule uses the linear combination of the inputs directly. Second, the perceptron rule is guaranteed to converge to a consistent hypothesis provided the data are linearly separable, while the delta rule converges only in the limit but does not require the data to be linearly separable.

There are two main difficulties with the gradient descent method:

    1.   Convergence to a minimum may take a long time.

    2.   There is no guarantee we will find the global minimum.

These difficulties are commonly addressed by adding a momentum term to the weight update and by restarting the search from randomly perturbed weight vectors; a sketch of a momentum update follows.
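
The notes do not give formulas for these fixes, but one common form of the momentum term (an assumption here, with momentum coefficient alpha, e.g. 0.9) folds a fraction of the previous update into the current one:

def momentum_step(w, grad, prev_delta, eta=0.01, alpha=0.9):
    """One weight update with momentum: delta_w = eta * grad + alpha * prev_delta.
    Here grad is the accumulated sum_d (t_d - o_d) * x_d from the sketch above."""
    delta = [eta * g + alpha * p for g, p in zip(grad, prev_delta)]
    new_w = [wi + d for wi, d in zip(w, delta)]
    return new_w, delta  # pass delta back in as prev_delta on the next step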



