5. Natural Neural Networks
• Signals “move” between neurons by
electrochemical means
• The synapses release a chemical
transmitter – the sum of which can
cause a threshold to be reached –
causing the neuron to “fire”
• Synapses can be inhibitory or excitatory
6. Natural Neural Networks
• We are born with about 100 billion
neurons
• A neuron may connect to as many as
100,000 other neurons
7. Natural Neural Networks
• McCulloch & Pitts (1943) are generally
recognised as the designers of the first
neural network
• Many of their ideas still used today e.g.
– many simple units, “neurons” combine to
give increased computational power
– the idea of a threshold
8. G51IAI Neural Networks
Modelling a Neuron
• a_j : Activation value of unit j
• W_{j,i} : Weight on link from unit j to unit i
• in_i : Weighted sum of inputs to unit i
• a_i : Activation value of unit i
• g : Activation function
$in_i = \sum_j W_{j,i}\, a_j$
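A minimal Python sketch of the weighted sum, assuming the usual reading that unit i then outputs a_i = g(in_i). The function names and the example values are hypothetical, chosen only for illustration.

```python
def weighted_input(weights, activations):
    """in_i: sum over j of W[j,i] * a[j] for one receiving unit i."""
    return sum(w * a for w, a in zip(weights, activations))


def unit_activation(weights, activations, g):
    """a_i = g(in_i), where g is any activation function."""
    return g(weighted_input(weights, activations))


# Hypothetical values: three sending units feeding one receiving unit.
a = [1.0, 0.0, 1.0]    # activations a_j
w = [0.5, -1.0, 2.0]   # weights W[j,i]

in_i = weighted_input(w, a)  # 0.5*1 + (-1)*0 + 2*1
a_i = unit_activation(w, a, lambda x: 1 if x >= 1 else 0)
print(in_i, a_i)
```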
9. Activation Functions
• Step_t(x) = 1 if x ≥ t, else 0 (t is the threshold)
• Sign(x) = +1 if x ≥ 0, else –1
• Sigmoid(x) = 1/(1 + e^{-x})
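The three activation functions can be sketched in Python as follows. The function and parameter names are hypothetical; the definitions follow the formulas on this slide.

```python
import math


def step(x, t=0.0):
    """Step_t(x): 1 if x >= t, else 0."""
    return 1 if x >= t else 0


def sign(x):
    """Sign(x): +1 if x >= 0, else -1."""
    return 1 if x >= 0 else -1


def sigmoid(x):
    """Sigmoid(x) = 1 / (1 + e^(-x)); adequate for moderate values of x."""
    return 1.0 / (1.0 + math.exp(-x))


print(step(2, t=2), sign(-0.1), sigmoid(0))
```

Step and Sign give hard decisions, while Sigmoid gives a smooth value strictly between 0 and 1.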
10. Building a Neural Network
1. “Select Structure”: Design the way that the
neurons are interconnected
2. “Select weights” – decide the strengths with
which the neurons are interconnected
– weights are selected to get a “good match”
to a “training set”
– “training set”: set of inputs and desired
outputs
– often use a “learning algorithm”
11. Neural Networks
• Hebb (1949) developed the first
learning rule
– on the premise that if two neurons were
active at the same time the strength
between them should be increased
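One common way to write Hebb's premise as an update is sketched below. This is an assumption for illustration, not a formula from the slides: the learning rate `rate` and the product form are hypothetical choices.

```python
def hebb_update(w, a_pre, a_post, rate=0.1):
    """Strengthen the link when both units are active at the same time.

    With 0/1 activations the weight only grows when a_pre = a_post = 1.
    """
    return w + rate * a_pre * a_post


print(hebb_update(0.5, 1, 1))  # both active: weight increases
print(hebb_update(0.5, 1, 0))  # only one active: weight unchanged
```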
12. Neural Networks
• During the 50’s and 60’s many researchers worked,
amidst great excitement, on a particular net structure
called the “perceptron”.
• Minsky & Papert (1969) demonstrated a strong limit
on the power of perceptrons
– neural network research then largely stalled for about 15 years
• Only in the mid 80’s (Parker and LeCun) was interest
revived, because of their learning algorithm for a
better design of net
– (in fact Werbos had discovered the algorithm in 1974)
13. Basic Neural Networks
• Will first look at simplest networks
• “Feed-forward”
– Signals travel in one direction through net
– Net computes a function of the inputs
14. The First Neural Networks
Neurons in a McCulloch-Pitts network are
connected by directed, weighted paths
[Figure: input units X1, X2 and X3 connected to unit Y by directed paths with weights 2, 2 and -1]
15. The First Neural Networks
If the weight on a path is positive the path is
excitatory, otherwise it is inhibitory
16. The First Neural Networks
The activation of a neuron is binary. That is,
the neuron either fires (activation of one) or
does not fire (activation of zero).
17. The First Neural Networks
For the network shown here the activation
function for unit Y is
f(y_in) = 1 if y_in ≥ θ, else 0
where y_in is the total input signal received and
θ is the threshold for Y
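A minimal sketch of this firing rule in Python. Two things are assumptions and not stated on the slide: that the weights 2, 2 and -1 belong to X1, X2 and X3 in that order, and the threshold θ = 4, picked only so that an active inhibitory input stops Y from firing. The names are hypothetical.

```python
def mp_fire(inputs, weights, theta):
    """f(y_in) = 1 if y_in >= theta, else 0, with y_in the weighted input sum."""
    y_in = sum(x * w for x, w in zip(inputs, weights))
    return 1 if y_in >= theta else 0


WEIGHTS = (2, 2, -1)  # assumed order: X1, X2 excitatory; X3 inhibitory
THETA = 4             # assumed threshold, for illustration only

print(mp_fire((1, 1, 0), WEIGHTS, THETA))  # y_in = 4, reaches theta
print(mp_fire((1, 1, 1), WEIGHTS, THETA))  # y_in = 3, inhibited
```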
18. The First Neural Networks
Originally, all excitatory connections into a
particular neuron had the same weight,
although connections into different neurons
could have different weights
Later, weights were allowed to be arbitrary
19. The First Neural Networks
Each neuron has a fixed threshold. If the net
input into the neuron is greater than or equal
to the threshold, the neuron fires
20. The First Neural Networks
The threshold is set such that any non-zero
inhibitory input will prevent the neuron from
firing
21. Building Logic Gates
• Computers are built out of “logic gates”
• Can we use neural nets to represent logical
functions?
• Use threshold (step) function for activation
function
– all activation values are 0 (false) or 1 (true)
22. The First Neural Networks
AND Function
[Figure: X1 and X2 each connected to Y with weight 1]
X1  X2  Y
1   1   1
1   0   0
0   1   0
0   0   0
Threshold(Y) = 2
23. The First Neural Networks
OR Function
[Figure: X1 and X2 each connected to Y with weight 2]
X1  X2  Y
1   1   1
1   0   1
0   1   1
0   0   0
Threshold(Y) = 2
24. The First Neural Networks
AND NOT Function
[Figure: X1 connected to Y with weight 2, X2 connected to Y with weight -1]
X1  X2  Y
1   1   0
1   0   1
0   1   0
0   0   0
Threshold(Y) = 2
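The AND, OR and AND NOT gates can be checked with a short Python sketch of a threshold unit. The weights and the threshold of 2 are taken from the slides; giving weight 2 to X1 and -1 to X2 in AND NOT is the assignment that matches its truth table. The function names are hypothetical.

```python
def mp_neuron(inputs, weights, threshold=2):
    """Fire (1) if the weighted input sum reaches the threshold, else 0."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net >= threshold else 0


def and_gate(x1, x2):
    return mp_neuron((x1, x2), (1, 1))


def or_gate(x1, x2):
    return mp_neuron((x1, x2), (2, 2))


def and_not_gate(x1, x2):
    """X1 AND NOT X2."""
    return mp_neuron((x1, x2), (2, -1))


# Print the three truth tables side by side.
print("X1 X2 AND OR ANDNOT")
for x1 in (1, 0):
    for x2 in (1, 0):
        print(x1, x2, and_gate(x1, x2), or_gate(x1, x2), and_not_gate(x1, x2))
```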
27. Perceptron
• Synonym for Single-Layer, Feed-Forward Network
• First studied in the 50’s
• Other networks were known about, but the perceptron was the
only one capable of learning, and thus all research was
concentrated in this area
28. Perceptron
• A single weight only affects one output, so we can restrict our
investigations to a single-output model (figure on the right)
• Notation can be simpler, i.e.
$O = Step_0\left(\sum_j W_j I_j\right)$
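A minimal sketch of the single-output perceptron in Python, with Step_0 as the activation. The example assumes one common convention that the slide does not state: a fixed extra input of -1 whose weight plays the role of the threshold. The names and weight values are hypothetical.

```python
def perceptron_output(weights, inputs):
    """O = Step_0(sum_j W_j * I_j): 1 if the weighted sum is >= 0, else 0."""
    total = sum(w * i for w, i in zip(weights, inputs))
    return 1 if total >= 0 else 0


# Assumed convention: the last input is fixed at -1, so its weight (1.5)
# acts as a threshold. These weights make the unit compute AND.
AND_WEIGHTS = (1, 1, 1.5)

for x1 in (1, 0):
    for x2 in (1, 0):
        print(x1, x2, perceptron_output(AND_WEIGHTS, (x1, x2, -1)))
```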
30. What can perceptrons represent?
[Figure: the input points (0,0), (0,1), (1,0) and (1,1) plotted for AND and for XOR, with a straight line separating the two output classes for AND]
• Functions which can be separated in this way are called
Linearly Separable
• Only linearly separable functions can be represented by a
perceptron
• XOR cannot be represented by a perceptron
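The difference between AND and XOR can be illustrated with a small brute-force search in Python. This is only a sketch: it tries a coarse, hypothetical grid of weights and thresholds, so an empty result for XOR illustrates the point and does not prove it. The proof is short: XOR would need t > 0, w1 ≥ t and w2 ≥ t, yet w1 + w2 < t, which is impossible.

```python
from itertools import product

POINTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND_TABLE = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Candidate values from -4.0 to 4.0 in steps of 0.5 (an arbitrary grid).
GRID = [k / 2 for k in range(-8, 9)]


def find_perceptron(target):
    """Return some (w1, w2, t) that reproduces target, or None if none found."""
    for w1, w2, t in product(GRID, repeat=3):
        if all((1 if w1 * x1 + w2 * x2 >= t else 0) == target[(x1, x2)]
               for x1, x2 in POINTS):
            return (w1, w2, t)
    return None


print("AND:", find_perceptron(AND_TABLE))
print("XOR:", find_perceptron(XOR_TABLE))
```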
31. What can perceptrons represent?
Linear Separability is also possible in more than 3 dimensions –
but it is harder to visualise
32. XOR
• XOR is not “linearly separable”
– Cannot be represented by a perceptron
• What can we do instead?
1. Convert to logic gates that can be
represented by perceptrons
2. Chain together the gates
• Make sure you understand the following
– check it using truth tables
X1 XOR X2 = (X1 AND NOT X2) OR (X2 AND NOT X1)
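The truth-table check can also be done mechanically. A minimal Python sketch, with hypothetical function names, compares both sides of the identity for all four inputs:

```python
from itertools import product


def xor_direct(x1, x2):
    """XOR defined directly: true when exactly one input is 1."""
    return int(x1 != x2)


def xor_from_gates(x1, x2):
    """(X1 AND NOT X2) OR (X2 AND NOT X1)."""
    return int((x1 and not x2) or (x2 and not x1))


print("X1 X2 XOR gates")
for x1, x2 in product((0, 1), repeat=2):
    print(x1, x2, xor_direct(x1, x2), xor_from_gates(x1, x2))
```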
33. The First Neural Networks
XOR Function
[Figure: X1 and X2 feed hidden units Z1 (X1 AND NOT X2) and Z2 (X2 AND NOT X1) through weights 2 and -1; Z1 and Z2 feed Y through weights 2]
X1  X2  Y
1   1   0
1   0   1
0   1   1
0   0   0
X1 XOR X2 = (X1 AND NOT X2) OR (X2 AND NOT X1)
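The two-layer XOR network can be sketched in Python by chaining threshold units. The weights 2 and -1 follow the slide; the threshold of 2 on every unit is an assumption carried over from the AND NOT and OR gates, and the function names are hypothetical.

```python
def mp_neuron(inputs, weights, threshold=2):
    """Fire (1) if the weighted input sum reaches the threshold, else 0."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net >= threshold else 0


def xor_net(x1, x2):
    z1 = mp_neuron((x1, x2), (2, -1))   # hidden unit Z1: X1 AND NOT X2
    z2 = mp_neuron((x1, x2), (-1, 2))   # hidden unit Z2: X2 AND NOT X1
    return mp_neuron((z1, z2), (2, 2))  # output Y: Z1 OR Z2


for x1 in (1, 0):
    for x2 in (1, 0):
        print(x1, x2, xor_net(x1, x2))
```

Z1 and Z2 form the hidden layer: Y never sees X1 and X2 directly, only the outputs of the two AND NOT units.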
34. Single- vs. Multiple-Layers
• Once we chain together the gates then we have
“hidden layers”
– layers that are “hidden” from the output lines
• Have just seen that hidden layers allow us to
represent XOR
– Perceptron is single-layer
– Multiple layers increase the representational
power, so e.g. can represent XOR
• Generally, useful nets have multiple layers
– typically 2-4 layers
35. Expectations
• Be able to explain the terminology used, e.g.
– activation functions
– step and threshold functions
– perceptron
– feed-forward
– multi-layer, hidden layers
– linear separability
• XOR
– why perceptrons cannot cope with XOR
– how XOR is possible with hidden layers