1. Neural Networks NN 1 1
Neural Networks
Teacher: Elena Marchiori, R4.47, elena@cs.vu.nl
Assistant: Marius Codrea, S4.16, mcodrea@few.vu.nl
Course Outline
The course is divided into two parts: theory and
practice.
1. Theory covers basic topics in neural
network theory and their application to
supervised and unsupervised learning.
2. Practice deals with the basics of Matlab and
the application of NN learning algorithms.
Course Information
• Register for practicum: send email to
mcodrea@few.vu.nl with:
1. Subject: NN practicum
2. Content: names, study numbers, study directions
(AI,BWI,I, other)
• Course information, plan, slides and links to
on-line material are available at
http://www.cs.vu.nl/~elena/nn.html
Course Evaluation
• Course value: 6 ECTS
• Evaluation is based on the following two
parts:
– theory (weight 0.5): final exam at the end of the
course consisting of questions about the theory part.
(Dates to be announced)
– practicum (weight 0.5): Matlab programming
assignments to be done in pairs (available
during the course at
http://www.few.vu.nl/~codrea/nn)
What are Neural Networks?
• Simple computational elements forming a
large network
– Emphasis on learning (pattern recognition)
– Local computation (neurons)
• The definition of NNs is vague
– Often, but not always, inspired by the biological brain
History
• Roots of work on NN are in:
• Neurobiological studies (more than one century ago):
• How do nerves behave when stimulated by different magnitudes
of electric current? Is there a minimal threshold needed for
nerves to be activated? Given that no single nerve cell is long
enough, how do different nerve cells communicate with each
other?
• Psychological studies:
• How do animals learn, forget, recognize and perform other types
of tasks?
• Psycho-physical experiments helped to understand how individual
neurons and groups of neurons work.
• McCulloch and Pitts introduced the first mathematical model of
single neuron, widely applied in subsequent work.
History
Prehistory:
• Golgi and Ramon y Cajal study the nervous system and discover
neurons (end of 19th century)
History (brief):
• McCulloch and Pitts (1943): the first artificial neural network with
binary neurons
• Hebb (1949): learning = neurons that fire together wire together
• Minsky (1954): neural networks for reinforcement learning
• Taylor (1956): associative memory
• Rosenblatt (1958): perceptron, a single neuron for supervised
learning
History
• Widrow and Hoff (1960): Adaline
• Minsky and Papert (1969): limitations of single-layer perceptrons (and
they erroneously claimed that the limitations hold for multi-layer
perceptrons)
Stagnation in the 70's:
• Individual researchers continue laying foundations
• von der Malsburg (1973): competitive learning and self-organization
Big neural-nets boom in the 80's
• Grossberg: adaptive resonance theory (ART)
• Hopfield: Hopfield network
• Kohonen: self-organising map (SOM)
History
• Oja: neural principal component analysis (PCA)
• Ackley, Hinton and Sejnowski: Boltzmann machine
• Rumelhart, Hinton and Williams: backpropagation
Diversification during the 90's:
• Machine learning: mathematical rigor, Bayesian methods,
information theory, support vector machines (now state of the
art!), ...
• Computational neurosciences: workings of most subsystems of the
brain are understood at some level; research ranges from low-level
compartmental models of individual neurons to large-scale brain
models
Course Topics
Learning Tasks

Supervised
• Data: labeled examples (input, desired output)
• Tasks: classification, pattern recognition, regression
• NN models: perceptron, adaline, feed-forward NN, radial basis function, support vector machines

Unsupervised
• Data: unlabeled examples (different realizations of the input)
• Tasks: clustering, content addressable memory
• NN models: self-organizing maps (SOM), Hopfield networks
NNs: goal and design
– Knowledge about the learning task is given in the
form of a set of examples (dataset) called training
examples.
– A NN is specified by:
• an architecture: a set of neurons and links connecting
neurons. Each link has a weight,
• a neuron model: the information processing unit of the
NN,
• a learning algorithm: used for training the NN by
modifying the weights in order to solve the particular
learning task correctly on the training examples.
The aim is to obtain a NN that generalizes well, that
is, that behaves correctly on new examples of the
learning task.
Example: Alvinn
[Figure: ALVINN drives autonomously at 70 mph on a public
highway. A 30x32-pixel camera image forms the inputs; the
30x32 weights feed into each of 4 hidden units, and 30
output units encode the steering direction.]
Face Recognition
90% accuracy in learning head pose and in recognizing 1 of 20 faces
Dimensions of a Neural Network
• network architectures
• types of neurons
• learning algorithms
• applications
Network architectures
• Three different classes of network architectures:
– single-layer feed-forward
– multi-layer feed-forward (in both, neurons are organized in acyclic layers)
– recurrent
• The architecture of a neural network is linked with the
learning algorithm used to train it
Single Layer Feed-forward
[Figure: an input layer of source nodes fully connected to an output layer of neurons]
Multi-layer feed-forward
[Figure: a 3-4-2 network: an input layer of 3 nodes, a hidden layer of 4 neurons, and an output layer of 2 neurons]
Recurrent Network with hidden neurons: the unit delay
operator z^-1 is used to model a dynamic system
[Figure: a recurrent network with input, hidden, and output layers; feedback connections pass through unit delay (z^-1) elements]
The Neuron
[Figure: the neuron model. Input values x1, x2, …, xm are multiplied by weights w1, w2, …, wm and passed to a summing function; together with the bias b this produces the local field v, which goes through the activation function φ(·) to give the output y]
Input Signal and Weights
Input signals
An input may be either a raw or preprocessed
signal or an image. Alternatively, specific
features can be used.
If specific features are used as input, their
number and selection are crucial and
application dependent.
Weights
A weight connects an input to a summing
node and affects the summing operation.
The quality of a network can be seen from
its weights.
The bias is a constant input with its own
weight.
Usually the weights are randomized in the
beginning.
The Neuron
• The neuron is the basic information processing unit of
a NN. It consists of:
1. A set of links, describing the neuron inputs, with
weights w1, w2, …, wm
2. An adder function (linear combiner) computing
the weighted sum of the inputs (real numbers):
u = Σ_{j=1..m} w_j x_j
3. An activation function φ (squashing function) for
limiting the amplitude of the neuron output:
y = φ(u + b)
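As a sketch, this neuron model fits in a few lines of Python (the practicum uses Matlab; the inputs, weights, bias, and the choice of a sigmoid activation below are purely illustrative):

```python
import math

def neuron(x, w, b):
    # Adder: u = sum_j w_j x_j; local field: v = u + b;
    # output: y = phi(v), here with a sigmoid activation
    u = sum(wj * xj for wj, xj in zip(w, x))
    v = u + b
    return 1.0 / (1.0 + math.exp(-v))

y = neuron(x=[1.0, 2.0], w=[0.5, -0.25], b=0.1)
print(round(y, 4))  # 0.525
```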
Bias of a Neuron
• The bias b has the effect of applying an affine
transformation to the weighted sum u:
v = u + b
• v is called the induced field of the neuron
[Figure: with u = x1 − x2, the lines x1 − x2 = −1, x1 − x2 = 0, and x1 − x2 = 1 in the (x1, x2) plane show how the bias shifts the decision boundary]
Bias as extra input
[Figure: the same neuron model as before, with an extra input x0 = +1 whose weight is w0 = b]
• The bias is an external parameter of the neuron. It can be
modeled by adding an extra input:
v = Σ_{j=0..m} w_j x_j, with x0 = +1 and w0 = b
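This equivalence can be checked numerically; the sketch below uses Python and illustrative numbers:

```python
def field_with_bias(x, w, b):
    # v = sum_{j=1}^{m} w_j x_j + b
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def field_extra_input(x, w, b):
    # v = sum_{j=0}^{m} w_j x_j with x_0 = +1 and w_0 = b
    return sum(wj * xj for wj, xj in zip([b] + w, [1.0] + x))

x, w, b = [2.0, -1.0], [0.25, 0.5], -0.5
print(field_with_bias(x, w, b) == field_extra_input(x, w, b))  # True
```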
Activation Function
There are different activation functions used in different applications. The
most common ones are:
• Hard limiter: φ(v) = 1 if v ≥ 0; 0 if v < 0
• Piecewise linear: φ(v) = 1 if v ≥ 1/2; v if 1/2 > v > −1/2; 0 if v ≤ −1/2
• Sigmoid: φ(v) = 1 / (1 + exp(−av))
• Hyperbolic tangent: φ(v) = tanh(v)
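The four functions can be sketched directly in Python (the sigmoid slope parameter a is set to 1 by default; this is an illustrative choice):

```python
import math

def hard_limiter(v):
    # 1 if v >= 0, 0 otherwise
    return 1.0 if v >= 0 else 0.0

def piecewise_linear(v):
    # 1 above 1/2, 0 below -1/2, linear in between
    if v >= 0.5:
        return 1.0
    if v <= -0.5:
        return 0.0
    return v

def sigmoid(v, a=1.0):
    # 1 / (1 + exp(-a v))
    return 1.0 / (1.0 + math.exp(-a * v))

def tanh_act(v):
    return math.tanh(v)

for f in (hard_limiter, piecewise_linear, sigmoid, tanh_act):
    print(f.__name__, f(0.0), f(2.0))
```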
Neuron Models
• The choice of φ determines the neuron model. Examples:
• step function: φ(v) = a if v < c; b if v > c
• ramp function: φ(v) = a if v < c; b if v > d;
a + ((v − c)(b − a) / (d − c)) otherwise
• sigmoid function: φ(v) = z + 1 / (1 + exp(−xv + y)),
with z, x, y parameters
• Gaussian function: φ(v) = (1 / (√(2π) σ)) exp(−(1/2) ((v − µ)/σ)²)
Learning Algorithms
Depend on the network architecture:
• Error correcting learning (perceptron)
• Delta rule (AdaLine, Backprop)
• Competitive Learning (Self Organizing Maps)
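As a taste of error-correcting learning, here is a minimal perceptron training loop for the logical AND function; the learning rate, initial weights, and epoch count are illustrative choices, not prescribed by the course:

```python
def step(v):
    # Hard-limiter activation used by the perceptron
    return 1 if v >= 0 else 0

# Training set for logical AND: (inputs, desired output)
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

w, b, eta = [0.0, 0.0], 0.0, 0.5   # weights, bias, learning rate
for epoch in range(20):
    for x, d in data:
        y = step(w[0] * x[0] + w[1] * x[1] + b)
        err = d - y                               # error-correcting term
        w = [wj + eta * err * xj for wj, xj in zip(w, x)]
        b += eta * err

print([step(w[0] * x[0] + w[1] * x[1] + b) for x, _ in data])  # [0, 0, 0, 1]
```

AND is linearly separable, so the perceptron convergence theorem guarantees this loop finds a separating line.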
Applications
• Classification:
– Image recognition
– Speech recognition
– Diagnosis
– Fraud detection
– …
• Regression:
– Forecasting (prediction based on past history)
– …
• Pattern association:
– Retrieve an image from a corrupted one
– …
• Clustering:
– client profiles
– disease subtypes
– …
Supervised learning
• Linear classifiers: Perceptron, Adaline
• Non-linear classifiers: Feed-forward networks, Radial basis function networks, Support vector machines
Unsupervised learning
• Clustering: Self-organizing maps, K-means
• Content addressable memories / Optimization: Hopfield networks
Vectors: basics
(slides by Hyoungjune Yi: aster@cs.umd.edu)
• Ordered set of numbers: (1,2,3,4)
• Example: (x,y,z) coordinates of a point in space
• v = (x1, x2, …, xn), ‖v‖ = (Σ_{i=1..n} x_i²)^{1/2}
• If ‖v‖ = 1, v is a unit vector
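A minimal Python sketch of the norm and unit-vector definitions (the vector below is illustrative):

```python
import math

def norm(v):
    # Euclidean length: ||v|| = (sum_i x_i^2)^(1/2)
    return math.sqrt(sum(x * x for x in v))

def normalize(v):
    # Scale v so that ||v|| = 1 (a unit vector)
    n = norm(v)
    return [x / n for x in v]

v = [3.0, 4.0]
print(norm(v))             # 5.0
print(normalize(v))        # [0.6, 0.8]
print(norm(normalize(v)))  # approximately 1.0
```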
Operations on vectors
• sum
• max, min, mean, sort, …
• Pointwise: .^
Inner (dot) Product
v · w = (x1, x2) · (y1, y2) = x1 y1 + x2 y2
The inner product is a SCALAR!
v · w = ‖v‖ ‖w‖ cos α, where α is the angle between v and w
v · w = 0 ⇔ v ⊥ w
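The three properties above in a short Python sketch (illustrative vectors):

```python
import math

def dot(v, w):
    # v.w = sum_i v_i w_i  (a scalar)
    return sum(a * b for a, b in zip(v, w))

def angle(v, w):
    # From v.w = ||v|| ||w|| cos(alpha)
    nv = math.sqrt(dot(v, v))
    nw = math.sqrt(dot(w, w))
    return math.acos(dot(v, w) / (nv * nw))

print(dot([1.0, 0.0], [0.0, 2.0]))   # 0.0, so the vectors are perpendicular
print(math.degrees(angle([1.0, 0.0], [1.0, 1.0])))  # approximately 45
```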
Matrices
A_{n×m} =
[ a11 a12 … a1m
  a21 a22 … a2m
  a31 a32 … a3m
  …
  an1 an2 … anm ]
Sum: C_{n×m} = A_{n×m} + B_{n×m}, with c_ij = a_ij + b_ij
A and B must have the same dimensions
Matrices
Product: C_{n×p} = A_{n×m} B_{m×p}, with c_ij = Σ_{k=1..m} a_ik b_kj
A and B must have compatible dimensions
In general A_{n×n} B_{n×n} ≠ B_{n×n} A_{n×n}
Identity matrix:
I = [ 1 0 0
      0 1 0
      0 0 1 ]
AI = IA = A
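A sketch of the product rule in Python, also checking non-commutativity and the identity property (the matrices are chosen for illustration):

```python
def matmul(A, B):
    # c_ij = sum_k a_ik b_kj; the inner dimensions of A and B must agree
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
I = [[1, 0], [0, 1]]
print(matmul(A, B))                   # [[2, 1], [4, 3]]
print(matmul(A, B) == matmul(B, A))   # False: the product does not commute
print(matmul(A, I) == A)              # True: AI = A
```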
Matrices
Transpose: C_{m×n} = (A_{n×m})^T, with c_ij = a_ji
(AB)^T = B^T A^T
(A + B)^T = A^T + B^T
If A^T = A, A is symmetric
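The transpose rules can be checked numerically; the helper `matmul` repeats the product formula from the previous slide, and the matrices are illustrative:

```python
def transpose(A):
    # c_ij = a_ji
    return [[A[j][i] for j in range(len(A))] for i in range(len(A[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))  # True
print(transpose(A) == A)  # False: this A is not symmetric
```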
Matrices
Inverse: A_{n×n} A⁻¹_{n×n} = A⁻¹_{n×n} A_{n×n} = I
A must be square
For a 2×2 matrix:
( a11 a12 ; a21 a22 )⁻¹ = 1/(a11 a22 − a12 a21) · ( a22 −a12 ; −a21 a11 )
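The 2×2 inverse formula sketched in Python and checked against A·A⁻¹ = I, up to floating-point rounding (the example matrix is illustrative):

```python
def inverse2x2(A):
    # ((a11, a12), (a21, a22))^-1 = 1/det * ((a22, -a12), (-a21, a11)),
    # valid only when det = a11*a22 - a12*a21 is nonzero
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    assert det != 0, "matrix is not invertible"
    return [[a22 / det, -a12 / det], [-a21 / det, a11 / det]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[4.0, 7.0], [2.0, 6.0]]
C = matmul(A, inverse2x2(A))
print(C)  # the identity matrix, up to floating-point rounding
```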
2D Translation Equation
P = (x, y), t = (tx, ty)
P′ = (x + tx, y + ty) = P + t
[Figure: point P translated by the vector t = (tx, ty) to P′]
2D Translation using Matrices
P = (x, y), t = (tx, ty)
P′ = ( x + tx ; y + ty ) = ( 1 0 tx ; 0 1 ty ) · ( x ; y ; 1 )
[Figure: the same translation of P by t, now written as a matrix acting on homogeneous coordinates]
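A sketch of the homogeneous form in Python (the point and translation vector are illustrative):

```python
def translate(P, t):
    # P' = T . (x, y, 1)^T with T = [[1, 0, tx], [0, 1, ty], [0, 0, 1]]
    tx, ty = t
    T = [[1, 0, tx],
         [0, 1, ty],
         [0, 0, 1]]
    x, y = P
    hom = [x, y, 1]                                   # homogeneous coordinates
    Pp = [sum(T[i][j] * hom[j] for j in range(3)) for i in range(3)]
    return (Pp[0], Pp[1])                             # drop the extra coordinate

print(translate((2, 3), (5, -1)))  # (7, 2)
```

Writing the translation as a 3×3 matrix lets it compose with other transformations (rotations, scalings) by matrix multiplication alone.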