The document discusses backpropagation, an algorithm used to train neural networks. It begins with background on perceptron learning and the need for an algorithm that can train multilayer perceptrons to perform nonlinear classification. It then describes the development of backpropagation, from early work in the 1970s to its popularization in the 1980s. The document provides examples of using backpropagation to design networks for binary classification and multi-class problems. It also outlines the generalized mathematical expressions and steps involved in backpropagation, including calculating the error derivative with respect to weights and updating weights to minimize loss.
2. Background
• The Perceptron Learning Algorithm and Hebbian Learning can classify input patterns only if the patterns are linearly separable.
• We need an algorithm that can train a multilayer perceptron and classify patterns that are not linearly separable.
• The algorithm should also be able to use non-linear activation functions.
[Figure: two scatter plots of Class 1 vs Class 2 in the $(x_1, x_2)$ plane, one linearly separable, one linearly non-separable]
• Non-linear decision boundaries are needed.
• The perceptron algorithm can't be used.
• A variation of the gradient descent (GD) rule is used instead.
• More layers are required.
• A non-linear activation function is required.
Perceptron Algorithm:

$W_{new} = W_{old} + (t - y)\,x_i$

Gradient Descent Algorithm:

$W_{new} = W_{old} - \alpha \dfrac{\partial L}{\partial W}$

Loss function: $L = \dfrac{1}{2}(t - y)^2$
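The two update rules above can be sketched side by side. This is a minimal illustration, not code from the source; the AND dataset, folding the bias into the input as a constant 1, and the function names are all assumptions.

```python
import numpy as np

def perceptron_update(w, x, t, y):
    # Perceptron rule: W_new = W_old + (t - y) * x
    # The weights move only when the prediction y disagrees with the target t.
    return w + (t - y) * x

def gradient_descent_update(w, x, t, y, lr=0.1):
    # GD rule: W_new = W_old - lr * dL/dW, with L = 1/2 (t - y)^2,
    # so dL/dW = -(t - y) * x and the step is +lr * (t - y) * x.
    return w + lr * (t - y) * x

# One pass of the perceptron rule over the linearly separable AND problem
# (bias folded in as a constant third input x[2] = 1).
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
T = np.array([0, 0, 0, 1], dtype=float)
w = np.zeros(3)
for x, t in zip(X, T):
    y = 1.0 if w @ x > 0 else 0.0   # hard-limit activation
    w = perceptron_update(w, x, t, y)
```

In practice the perceptron loop is repeated over the dataset until no mistakes remain, which is guaranteed to terminate only when the classes are linearly separable, which is exactly the limitation motivating this section.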
3. Background - Backpropagation
• The perceptron learning rule of Frank Rosenblatt and the LMS algorithm of Bernard Widrow and Marcian Hoff were designed to train single-layer perceptron-like networks.
• Single-layer networks suffer from the disadvantage that they can only solve linearly separable classification problems. Both Rosenblatt and Widrow were aware of these limitations and proposed multilayer networks that could overcome them, but they were not able to generalize their algorithms to train these more powerful networks.
• The first description of an algorithm to train multilayer networks was contained in the 1974 thesis of Paul Werbos. The thesis presented the algorithm in the context of general networks, with neural networks as a special case, and was not disseminated in the neural network community.
• It was not until the mid-1980s that the backpropagation algorithm was rediscovered and widely publicized. It was rediscovered independently by David Rumelhart, Geoffrey Hinton, and Ronald Williams (1986), David Parker (1985), and Yann LeCun (1985).
• The algorithm was popularized by its inclusion in the book Parallel Distributed Processing [RuMc86], which described the work of the Parallel Distributed Processing Group led by psychologists David Rumelhart and James McClelland.
• The multilayer perceptron, trained by the backpropagation algorithm, is currently the most widely used neural network.
4. Network Design
Problem: Should you watch a movie or not?
Step 1: Design. The output can be Yes (1) or No (0), so one neuron (perceptron) is sufficient.
Step 2: Choose a suitable activation function for the output, along with a rule to update the weights (hard-limit function for the perceptron learning algorithm; sigmoid for the Widrow-Hoff rule, i.e. the delta rule).
$W_{new} = W_{old} - \alpha \dfrac{\partial L}{\partial W}$

$L = \dfrac{1}{2}\left(t - \hat{y}\right)^2 = \dfrac{1}{2}\left(t - \sigma(wx + b)\right)^2$

$\dfrac{\partial L}{\partial w} = \left(t - \sigma(wx + b)\right) \cdot \left(-\dfrac{\partial \sigma(wx + b)}{\partial w}\right) = -\left(t - \sigma(wx + b)\right)\,\sigma'(wx + b)\,x$
[Figure: single-neuron network: input $x$ (Director, Actor, Genre, or IMDB rating), weight $w$, bias $b$ (shown as weight $w_0 = b$ on a constant input 1), net input $wx + b$, sigmoid activation, output Yes (1) or No (0)]

$\hat{y} = \sigma(wx + b) = \dfrac{1}{1 + e^{-(wx + b)}}$
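The design above can be sketched as a short training loop: a single sigmoid neuron updated with the delta rule, using the gradient $\partial L/\partial w = -(t - \hat{y})\,\sigma'(wx+b)\,x$ derived above. The feature values, learning rate, and iteration count are illustrative assumptions, not from the source.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(w, b, x, t, lr=0.5):
    # Forward pass: y_hat = sigma(w.x + b)
    y_hat = sigmoid(w @ x + b)
    # sigma'(z) = sigma(z) * (1 - sigma(z)), so the shared error term is:
    grad = -(t - y_hat) * y_hat * (1 - y_hat)
    w = w - lr * grad * x    # dL/dw = grad * x
    b = b - lr * grad        # dL/db = grad * 1
    return w, b

# One movie's features (e.g. director, actor, genre, IMDB scores in [0, 1])
x = np.array([0.9, 0.7, 0.2, 0.8])
w, b = np.zeros(4), 0.0
for _ in range(200):
    w, b = train_step(w, b, x, t=1.0)   # target: Yes (1), watch the movie
print(sigmoid(w @ x + b))               # prediction moves toward 1
```

With zero initial weights the first prediction is $\sigma(0) = 0.5$; repeated delta-rule steps push the output toward the target of 1.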
5. Network Design
Problem: Sort the students into 4 houses based on three qualities: lineage, choice, and ethics.
Step 1: Design. Here the input vector is 3-D, i.e. for each student,

$\text{Student 1} = \begin{bmatrix} L_1 \\ C_1 \\ E_1 \end{bmatrix}, \quad \text{Student 2} = \begin{bmatrix} L_2 \\ C_2 \\ E_2 \end{bmatrix}$
[Figure: network with inputs $x_1, x_2, x_3$ (one student's three qualities) plus a bias input $x_0 = 1$, weights $w_{11} \ldots w_{23}$, and two sigmoid output neurons producing $\hat{y}_1$ and $\hat{y}_2$]

$\hat{y}_1 = \sigma(w_{11}x_1 + w_{12}x_2 + w_{13}x_3 + b_1)$

$\hat{y}_2 = \sigma(w_{21}x_1 + w_{22}x_2 + w_{23}x_3 + b_2)$

The two outputs together encode the four houses as a 2-bit target code, and the actual outputs $(\hat{y}_1, \hat{y}_2)$ are trained toward these targets:

Houses:  A  B  C  D
$y_1$:   0  1  1  0
$y_2$:   1  1  0  0
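The forward pass of this two-output design can be sketched as follows. The weight values, the student's quality scores, and the code-to-house mapping below are illustrative assumptions chosen for the sketch, not values from the source.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weight matrix: row i holds (w_i1, w_i2, w_i3) for output neuron i.
W = np.array([[ 2.0, -1.0, 0.5],
              [-0.5,  1.5, 2.0]])
b = np.array([0.0, -1.0])          # biases b1, b2

student = np.array([0.8, 0.3, 0.9])  # [lineage, choice, ethics], assumed scores
y = sigmoid(W @ student + b)         # shape (2,): the two outputs y1, y2
bits = (y > 0.5).astype(int)         # threshold each output to a 2-bit code

# Assumed 2-bit code per house (one distinct code each).
houses = {(0, 1): "A", (1, 1): "B", (1, 0): "C", (0, 0): "D"}
print(houses[tuple(bits)])           # prints "B" for this student
```

Because each of the four houses gets a distinct $(y_1, y_2)$ code, two output neurons are enough for a 4-class problem; more generally, $n$ binary outputs can distinguish up to $2^n$ classes.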