Neural networks 1

G51IAI
Introduction to AI
Andrew Parkes
Neural Networks 1

Neural Networks
• AIMA
– Section 20.5 of 2003 edition
• Fundamentals of Neural Networks:
Architectures, Algorithms and
Applications. L. Fausett, 1994
• An Introduction to Neural Networks (2nd
Ed). I. M. Morton, 1995

Brief History
• Try to create artificial intelligence based
on the natural intelligence we know:
• The brain
– massively interconnected neurons


Natural Neural Networks
• Signals “move” by electrochemical
means
• The synapses release a chemical
transmitter – the sum of which can
cause a threshold to be reached –
causing the neuron to “fire”
• Synapses can be inhibitory or excitatory

• We are born with about 100 billion
neurons
• A neuron may connect to as many as
100,000 other neurons

• McCulloch & Pitts (1943) are generally
recognised as the designers of the first
neural network
• Many of their ideas are still used today, e.g.
– many simple units, “neurons” combine to
give increased computational power
– the idea of a threshold

Modelling a Neuron
• a_j: Activation value of unit j
• W_{j,i}: Weight on link from unit j to unit i
• in_i: Weighted sum of inputs to unit i
• a_i: Activation value of unit i
• g: Activation function
$in_i = \sum_j W_{j,i}\, a_j$
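
As a minimal sketch of the definitions above, the weighted sum in_i and the resulting activation a_i = g(in_i) can be written in Python. The function names, the example weights and inputs, and the particular choice of g are illustrative assumptions, not taken from the slides.

```python
def weighted_sum(weights, activations):
    """in_i: sum over j of W[j,i] * a[j] for a single unit i."""
    return sum(w * a for w, a in zip(weights, activations))


def unit_activation(weights, activations, g):
    """a_i = g(in_i), where g is the activation function."""
    return g(weighted_sum(weights, activations))


# Illustrative values only (assumed, not from the slides).
incoming_activations = [1, 0, 1]      # a_j for three units feeding unit i
incoming_weights = [0.5, -1.0, 2.0]   # W_{j,i} for the links into unit i


def threshold_at_one(x):
    """An assumed activation function g: 1 if x >= 1, else 0."""
    return 1 if x >= 1 else 0


in_i = weighted_sum(incoming_weights, incoming_activations)
a_i = unit_activation(incoming_weights, incoming_activations, threshold_at_one)
```

Passing g as an argument keeps the weighted sum separate from the choice of activation function, so the same unit can be reused with any g.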

Activation Functions
• Step_t(x) = 1 if x ≥ t, else 0 (threshold = t)
• Sign(x) = +1 if x ≥ 0, else –1
• Sigmoid(x) = 1/(1 + e^(-x))
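
The three activation functions listed above can be sketched directly in Python. The function names are chosen here for illustration; the definitions follow the slide.

```python
import math


def step(x, t=0.0):
    """Step_t(x): 1 if x >= t, else 0."""
    return 1 if x >= t else 0


def sign(x):
    """Sign(x): +1 if x >= 0, else -1."""
    return 1 if x >= 0 else -1


def sigmoid(x):
    """Sigmoid(x) = 1 / (1 + e^(-x)).

    Note: math.exp overflows for very large negative x; that is
    acceptable for a sketch but would need care in real code.
    """
    return 1.0 / (1.0 + math.exp(-x))
```

Step and Sign jump abruptly at the threshold, whereas Sigmoid changes smoothly from values near 0 to values near 1.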

Building a Neural Network
1. “Select Structure”: Design the way that the
neurons are interconnected
2. “Select weights” – decide the strengths with
which the neurons are interconnected
– weights are selected to get a “good match”
to a “training set”
– “training set”: set of inputs and desired
outputs
– often use a “learning algorithm”

Neural Networks
• Hebb (1949) developed the first
learning rule
– on the premise that if two neurons were
active at the same time, the strength of the
connection between them should be increased
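
One common way to write this premise down is as a weight update proportional to the product of the two activations. The sketch below assumes that form; the function name and the `learning_rate` parameter are hypothetical and do not appear in the slides.

```python
def hebbian_update(weight, a_j, a_i, learning_rate=0.1):
    """Return the new weight on the link between units j and i.

    Assumed form: the weight grows by learning_rate * a_j * a_i, so with
    0/1 activations it increases only when both units are active.
    """
    return weight + learning_rate * a_j * a_i


both_active = hebbian_update(0.5, 1, 1)   # weight is strengthened
one_inactive = hebbian_update(0.5, 1, 0)  # weight is unchanged
```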

Neural Networks
• During the 50’s and 60’s many researchers worked,
amidst great excitement, on a particular net structure
called the “perceptron”.
• Minsky & Papert (1969) demonstrated a strong limit
on the power of perceptrons
– this saw the death of neural network research for about 15 years
• Only in the mid 80’s (Parker and LeCun) was interest
revived because of their learning algorithm for a
better design of net
– (in fact Werbos discovered the algorithm in 1974)

Basic Neural Networks
• Will first look at simplest networks
• “Feed-forward”
– Signals travel in one direction through net
– Net computes a function of the inputs

The First Neural Networks
Neurons in a McCulloch-Pitts network are
connected by directed, weighted paths
[Figure: inputs X1, X2 and X3 connect to output Y by paths with weights 2, 2 and -1]

If the weight on a path is positive the path is
excitatory, otherwise it is inhibitory

The activation of a neuron is binary. That is,
the neuron either fires (activation of one) or
does not fire (activation of zero).

For the network shown here the activation
function for unit Y is
f(y_in) = 1 if y_in ≥ θ, else 0
where y_in is the total input signal received
and θ is the threshold for Y

Originally, all excitatory connections into a
particular neuron had the same weight,
although connections into different neurons
could have different weights.
Later, weights were allowed to be arbitrary.

Each neuron has a fixed threshold. If the net
input into the neuron is greater than or equal
to the threshold, the neuron fires

The threshold is set such that any non-zero
inhibitory input will prevent the neuron from
firing
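
The properties of a McCulloch-Pitts neuron described above can be sketched as follows. The weights 2, 2 and -1 come from the figure; the threshold value 4 is an assumption (the slides do not state it), chosen so that any active inhibitory input prevents Y from firing.

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Fire (return 1) when the net input reaches the threshold, else return 0."""
    y_in = sum(x * w for x, w in zip(inputs, weights))
    return 1 if y_in >= threshold else 0


WEIGHTS = [2, 2, -1]  # X1 and X2 excitatory, X3 inhibitory (from the figure)
THETA = 4             # assumed threshold, not stated on the slides

# Evaluate Y for every combination of binary inputs.
outputs = {
    (x1, x2, x3): mcculloch_pitts([x1, x2, x3], WEIGHTS, THETA)
    for x1 in (0, 1)
    for x2 in (0, 1)
    for x3 in (0, 1)
}
```

With this threshold, Y fires only for X1 = 1, X2 = 1, X3 = 0: both excitatory inputs give a net input of 4, and an active X3 lowers it to 3, which is below the threshold.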

Building Logic Gates
• Computers are built out of “logic gates”
• Can we use neural nets to represent logical
functions?
• Use threshold (step) function for activation
function
– all activation values are 0 (false) or 1 (true)

AND Function
[Figure: X1 and X2 each connect to Y with weight 1]
AND
X1 X2 Y
1 1 1
1 0 0
0 1 0
0 0 0
Threshold(Y) = 2

OR Function
[Figure: X1 and X2 each connect to Y with weight 2]
OR
X1 X2 Y
1 1 1
1 0 1
0 1 1
0 0 0
Threshold(Y) = 2

AND NOT Function
[Figure: X1 connects to Y with weight 2; X2 connects to Y with weight -1]
AND NOT
X1 X2 Y
1 1 0
1 0 1
0 1 0
0 0 0
Threshold(Y) = 2
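
The AND, OR and AND NOT gates can be checked against their truth tables with a short sketch. The weights and the threshold of 2 are the ones given on the gate slides; the function names are chosen here for illustration.

```python
def threshold_unit(inputs, weights, threshold=2):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


def and_gate(x1, x2):
    return threshold_unit([x1, x2], [1, 1])    # both weights 1


def or_gate(x1, x2):
    return threshold_unit([x1, x2], [2, 2])    # both weights 2


def and_not_gate(x1, x2):
    return threshold_unit([x1, x2], [2, -1])   # X1 AND NOT X2


# Print the three truth tables side by side: X1 X2 AND OR ANDNOT
for x1 in (1, 0):
    for x2 in (1, 0):
        print(x1, x2, and_gate(x1, x2), or_gate(x1, x2), and_not_gate(x1, x2))
```

Only the weights differ between the three gates; the unit and its threshold stay the same.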

Simple Networks
          AND        OR         NOT
Input 1   0 0 1 1    0 0 1 1    0 1
Input 2   0 1 0 1    0 1 0 1
Output    0 0 0 1    0 1 1 1    1 0

Simple Networks
[Figure: a threshold unit with t = 0.0, inputs x and y, a constant input of -1, and weights labelled W = 1.5 and W = 1]

Perceptron
• Synonym for Single-
Layer, Feed-Forward
Network
• First Studied in the
50’s
• Other networks were
known about but the
perceptron was the
only one capable of
learning and thus all
research was
concentrated in this
area

Perceptron
• A single weight only
affects one output so
we can restrict our
investigations to a
model as shown on
the right
• Notation can be
simpler, i.e.
$O = \mathrm{Step}_0\left(\sum_j W_j I_j\right)$
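
The simplified notation can be sketched as a single function. Fixing the threshold at 0 works when the threshold is carried instead by a weight on a constant input of -1; the AND weights used below (1.5 on the constant input, 1 on each real input) are an illustrative assumption.

```python
def perceptron_output(weights, inputs):
    """O = Step_0(sum_j W_j * I_j): 1 if the weighted sum is >= 0, else 0."""
    total = sum(w * i for w, i in zip(weights, inputs))
    return 1 if total >= 0 else 0


# Assumed example: AND with the threshold folded into the first weight.
AND_WEIGHTS = [1.5, 1.0, 1.0]  # weight on the constant -1 input, then on I1 and I2


def perceptron_and(i1, i2):
    """Feed a constant -1 as the first input so the threshold stays at 0."""
    return perceptron_output(AND_WEIGHTS, [-1, i1, i2])
```

Treating the threshold as one more weight means a learning algorithm can adjust it in the same way as the other weights.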

What can perceptrons represent?
          AND        XOR
Input 1   0 0 1 1    0 0 1 1
Input 2   0 1 0 1    0 1 0 1
Output    0 0 0 1    0 1 1 0

[Figure: two plots of the input points (0,0), (0,1), (1,0) and (1,1), one for AND and one for XOR]
• Functions which can be separated in this way are called
Linearly Separable
• Only linearly separable functions can be represented by a
perceptron
• XOR cannot be represented by a perceptron
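
The difference between AND and XOR can be illustrated by searching for perceptron weights over a small grid. This is only a sketch: the grid and its range are arbitrary assumptions, and failing to find weights on a grid illustrates, but does not prove, that XOR is not linearly separable.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND_TABLE = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Candidate values for each weight and the threshold: -4.0, -3.5, ..., 4.0
GRID = [k / 2 for k in range(-8, 9)]


def find_perceptron(table):
    """Return some (w1, w2, t) that reproduces the table, or None if the grid has none."""
    for w1, w2, t in product(GRID, repeat=3):
        if all(
            (1 if w1 * x1 + w2 * x2 >= t else 0) == table[(x1, x2)]
            for x1, x2 in INPUTS
        ):
            return (w1, w2, t)
    return None


and_solution = find_perceptron(AND_TABLE)  # some weights exist for AND
xor_solution = find_perceptron(XOR_TABLE)  # None: no weights on the grid fit XOR
```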

Linear Separability is also possible in more than 3 dimensions –
but it is harder to visualise

XOR
• XOR is not “linearly separable”
– Cannot be represented by a perceptron
• What can we do instead?
1. Convert to logic gates that can be
represented by perceptrons
2. Chain together the gates
• Make sure you understand the following
– check it using truth tables
X1 XOR X2 = (X1 AND NOT X2) OR (X2 AND NOT X1)

XOR Function
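[Figure: X1 connects to Z1 with weight 2 and to Z2 with weight -1; X2 connects to Z2 with weight 2 and to Z1 with weight -1; Z1 and Z2 each connect to Y with weight 2]
XOR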
X1 X2 Y
1 1 0
1 0 1
0 1 1
0 0 0
X1 XOR X2 = (X1 AND NOT X2) OR (X2 AND NOT X1)
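
The chained network can be sketched by composing the gates from the earlier slides. The weights follow the figure; the threshold of 2 for every unit is an assumption carried over from the AND NOT and OR gate slides, since it is not restated here.

```python
def threshold_unit(inputs, weights, threshold=2):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


def xor_net(x1, x2):
    """Two-layer network: hidden units Z1 and Z2 feed the output unit Y."""
    z1 = threshold_unit([x1, x2], [2, -1])   # Z1 = X1 AND NOT X2
    z2 = threshold_unit([x1, x2], [-1, 2])   # Z2 = X2 AND NOT X1
    return threshold_unit([z1, z2], [2, 2])  # Y  = Z1 OR Z2


# Print the truth table: X1 X2 Y
for x1 in (1, 0):
    for x2 in (1, 0):
        print(x1, x2, xor_net(x1, x2))
```

Z1 and Z2 are the hidden units: each is a gate a single perceptron can represent, and their combination gives a function that no single perceptron can.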

Single- vs. Multiple-Layers
• Once we chain together the gates then we have
“hidden layers”
– layers that are “hidden” from the output lines
• Have just seen that hidden layers allow us to
represent XOR
– Perceptron is single-layer
– Multiple layers increase the representational
power, so e.g. can represent XOR
• Generally useful nets have multiple layers
– typically 2-4 layers

Expectations
• Be able to explain the terminology used, e.g.
– activation functions
– step and threshold functions
– perceptron
– feed-forward
– multi-layer, hidden layers
– linear separability
• XOR
– why perceptrons cannot cope with XOR
– how XOR is possible with hidden layers
