ai7.ppt

Machine Learning
Neural Networks
Slides mostly adapted from Tom
Mitchell, Han and Kamber
Artificial Neural Networks
 Computational models inspired by the human
brain:
 Algorithms that try to mimic the brain.
 Massively parallel, distributed system, made up of
simple processing units (neurons)
 Synaptic connection strengths among neurons are
used to store the acquired knowledge.
 Knowledge is acquired by the network from its
environment through a learning process
History
 late-1800's - Neural Networks appear as an
analogy to biological systems
 1960's and 70's – Simple neural networks appear
 Fall out of favor because the perceptron is not
effective by itself, and there were no good algorithms
for multilayer nets
 1986 – Backpropagation algorithm appears
 Neural Networks have a resurgence in popularity
 More computationally expensive
Applications of ANNs
 ANNs have been widely used in various domains
for:
 Pattern recognition
 Function approximation
 Associative memory
Properties
 Inputs are flexible
 any real values
 Highly correlated or independent
 Target function may be discrete-valued, real-valued, or
vectors of discrete or real values
 Outputs are real numbers between 0 and 1
 Resistant to errors in the training data
 Long training time
 Fast evaluation
 The function produced can be difficult for humans to
interpret
When to consider neural networks
 Input is high-dimensional discrete or real-valued (e.g., raw sensor input)
 Output is discrete or real-valued
 Output is a vector of values
 Possibly noisy data
 Form of target function is unknown
 Human readability of the result is not important
Examples:
 Speech phoneme recognition
 Image classification
 Financial prediction
December 10, 2022 Data Mining: Concepts and Techniques 7
A Neuron (= a perceptron)
 The n-dimensional input vector x is mapped into variable y by
means of the scalar product and a nonlinear function mapping
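[Figure: the input vector x (x0, x1, ..., xn) and the weight vector w (w0, w1, ..., wn) feed a weighted sum, from which the threshold t is subtracted; the activation function f then produces the output y.]
For example: $y = \mathrm{sign}\left(\sum_{i=0}^{n} w_i x_i - t\right)$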
Perceptron
 Basic unit in a neural network
 Linear separator
 Parts
 N inputs, x1 ... xn
 Weights for each input, w1 ... wn
 A bias input x0 (constant) and associated weight w0
 Weighted sum of inputs, y = w0x0 + w1x1 + ... + wnxn
 A threshold function or activation function,
 i.e., 1 if y > t, -1 if y <= t
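The parts listed above can be put together in a few lines. This is a minimal sketch, not a reference implementation: the function name `perceptron`, its parameter names, and the example weights are assumptions chosen for illustration.

```python
def perceptron(inputs, weights, bias_weight, threshold=0.0, bias_input=1.0):
    """Return 1 if the weighted sum exceeds the threshold t, otherwise -1."""
    # Weighted sum y = w0*x0 + w1*x1 + ... + wn*xn, with x0 the constant bias input.
    y = bias_weight * bias_input + sum(w * x for w, x in zip(weights, inputs))
    return 1 if y > threshold else -1


# Illustrative weights (assumed): the unit fires only when both inputs are 1.
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, perceptron(x, weights=[1.0, 1.0], bias_weight=-1.5))
```

With these assumed weights the unit acts as a linear separator for logical AND.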
Artificial Neural Networks (ANN)
 Model is an assembly of
inter-connected nodes
and weighted links
 Output node sums up
each of its input values
according to the weights
of its links
 Compare output node
against some threshold t
[Figure: input nodes X1, X2, X3 feed a black box through links with weights w1, w2, w3; the output node, with threshold t, produces Y.]
Perceptron Model: $Y = I\left(\sum_i w_i x_i - t > 0\right)$ or $Y = \mathrm{sign}\left(\sum_i w_i x_i - t\right)$
Types of connectivity
 Feedforward networks
 These compute a series of
transformations
 Typically, the first layer is the input
and the last layer is the output.
 Recurrent networks
 These have directed cycles in their
connection graph. They can have
complicated dynamics.
 More biologically realistic.
[Figure: a network of input units, hidden units, and output units.]
Different Network Topologies
 Single layer feed-forward networks
 Input layer projecting into the output layer
[Figure: single-layer network, with an input layer connected directly to an output layer.]
Different Network Topologies
 Multi-layer feed-forward networks
 One or more hidden layers. Each layer receives input only
from previous layers.
[Figure: 2-layer (1-hidden-layer) fully connected network, with input, hidden, and output layers.]
Different Network Topologies
 Multi-layer feed-forward networks
[Figure: network with an input layer, several hidden layers, and an output layer.]
Different Network Topologies
 Recurrent networks
 A network with feedback, where some of its outputs
are fed back as inputs (discrete time).
[Figure: recurrent network, with connections from the output layer back to the input layer.]
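Feedback in discrete time can be illustrated with a single unit whose previous output is fed back as an extra input. This is only a sketch: the tanh activation, the function name `run_recurrent_unit`, and the weights `w_in` and `w_back` are assumptions for the example, not part of the slides.

```python
import math


def run_recurrent_unit(inputs, w_in, w_back):
    """Run one tanh unit over a sequence, feeding its last output back in."""
    outputs = []
    previous = 0.0  # no output exists before the first time step
    for x in inputs:
        current = math.tanh(w_in * x + w_back * previous)
        outputs.append(current)
        previous = current
    return outputs


# The same input gives different outputs over time because of the feedback.
print(run_recurrent_unit([1.0, 1.0, 1.0], w_in=1.0, w_back=0.5))
# With the feedback weight set to 0 the unit behaves like a feed-forward one.
print(run_recurrent_unit([1.0, 1.0, 1.0], w_in=1.0, w_back=0.0))
```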
Algorithm for learning ANN
 Initialize the weights (w0, w1, …, wk)
 Adjust the weights in such a way that the output
of ANN is consistent with class labels of training
examples
 Error function: $E = \sum_i \left[ Y_i - f(w_i, X_i) \right]^2$
 Find the weights w_i that minimize the above error
function
 e.g., gradient descent, backpropagation algorithm
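As a rough illustration of minimizing the squared error E = sum over i of [Y_i - f(w, X_i)]^2 by gradient descent, the sketch below fits a single linear unit. The model f(w, x) = w0 + w1*x, the toy data, the learning rate, and the step count are all assumptions made for the example.

```python
def f(w, x):
    """Assumed model: a single linear unit with bias w[0] and weight w[1]."""
    return w[0] + w[1] * x


def squared_error(w, data):
    return sum((y - f(w, x)) ** 2 for x, y in data)


def gradient(w, data):
    # Partial derivatives of the squared error with respect to w[0] and w[1].
    g0 = sum(-2.0 * (y - f(w, x)) for x, y in data)
    g1 = sum(-2.0 * (y - f(w, x)) * x for x, y in data)
    return [g0, g1]


def gradient_descent(data, learning_rate=0.01, steps=2000):
    w = [0.0, 0.0]  # initialize the weights
    for _ in range(steps):
        g = gradient(w, data)
        # Step against the gradient to reduce the error.
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w


# Toy data generated from y = 2x + 1 (assumed for illustration).
DATA = [(float(x), 2.0 * x + 1.0) for x in range(5)]
weights = gradient_descent(DATA)
print(weights, squared_error(weights, DATA))
```

The learning rate has to be small enough for the steps to converge; 0.01 is adequate for this particular data set but is not a general recommendation.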
Optimizing concave/convex function
 Maximum of a concave function = minimum of a
convex function
Gradient ascent (concave) / Gradient descent (convex)
Gradient ascent rule
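As a hedged reminder, the standard textbook form of the two rules is given below. The symbols $\eta$ (learning rate), $f$ (a concave objective), and $E$ (a convex error) are assumptions introduced here for illustration:

$$
w \leftarrow w + \eta \, \nabla_w f(w) \quad \text{(gradient ascent)}
\qquad
w \leftarrow w - \eta \, \nabla_w E(w) \quad \text{(gradient descent)}
$$

Since maximizing $f$ is the same as minimizing $-f$, the two rules coincide when $E = -f$.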
Decision surface of a perceptron
 Decision surface is a hyperplane
 Can capture linearly separable classes
 Classes that are not linearly separable
 Use a network of perceptrons
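XOR is a common example of a function that no single perceptron can capture, while a small network of them can. The sketch below uses hand-picked weights and thresholds, which are assumptions chosen for illustration rather than learned values.

```python
def step(z):
    """Threshold unit: 1 if the net input is positive, otherwise 0."""
    return 1 if z > 0 else 0


def xor_network(x1, x2):
    # Hidden unit 1 computes OR, hidden unit 2 computes AND (assumed weights).
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    # Output fires when OR is on and AND is off, which is XOR.
    return step(h_or - h_and - 0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

Each unit on its own still draws a hyperplane; it is the combination of two hyperplanes in the hidden layer that separates the XOR classes.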
Multi-layer Networks
 Linear units are inappropriate
 A multi-layer network of linear units is no more expressive than a single layer
 Introduce non-linearity
 A threshold function is not differentiable
 Use the sigmoid function
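Both points can be checked numerically. The sketch below first shows that stacking two linear units collapses into one linear unit, then defines the sigmoid and its derivative. The function names and the example coefficients are assumptions for illustration.

```python
import math


def linear(w, b):
    """Return a linear unit x -> w*x + b."""
    return lambda x: w * x + b


# Two stacked linear units: 4*(2x + 3) + 5 ...
stacked = lambda x: linear(4, 5)(linear(2, 3)(x))
# ... equal a single linear unit 8x + 17, so the extra layer adds nothing.
single = linear(8, 17)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_derivative(z):
    # The sigmoid is differentiable everywhere: s'(z) = s(z) * (1 - s(z)).
    s = sigmoid(z)
    return s * (1.0 - s)


print([stacked(x) == single(x) for x in range(-2, 3)])
print(sigmoid(0.0), sigmoid_derivative(0.0))
```

The derivative is what makes gradient-based training possible, whereas a hard threshold has zero slope everywhere except at the jump.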
Backpropagation
 Iteratively process a set of training tuples & compare the network's
prediction with the actual known target value
 For each training tuple, the weights are modified to minimize the mean
squared error between the network's prediction and the actual target
value
 Modifications are made in the “backwards” direction: from the output
layer, through each hidden layer down to the first hidden layer, hence
“backpropagation”
 Steps
 Initialize weights (to small random #s) and biases in the network
 Propagate the inputs forward (by applying activation function)
 Backpropagate the error (by updating weights and biases)
 Terminating condition (when error is very small, etc.)
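The four steps can be sketched for a network with one hidden layer and one sigmoid output unit. This is a simplified illustration under stated assumptions, not the exact procedure from the source slides: the network size, learning rate, tolerance, seed, and the OR training tuples are all chosen for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_in, n_hidden, rng):
    # Step 1: small random weights; each weight list ends with the unit's bias.
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(net, x):
    # Step 2: propagate the inputs forward through the activation function.
    hidden, output = net
    h = [sigmoid(sum(w * xi for w, xi in zip(u[:-1], x)) + u[-1]) for u in hidden]
    o = sigmoid(sum(w * hi for w, hi in zip(output[:-1], h)) + output[-1])
    return h, o


def train_tuple(net, x, target, lr):
    # Step 3: backpropagate the error, output layer first, then hidden layer.
    hidden, output = net
    h, o = forward(net, x)
    delta_o = (target - o) * o * (1.0 - o)
    # Hidden error terms are computed before the output weights change.
    delta_h = [hj * (1.0 - hj) * output[j] * delta_o for j, hj in enumerate(h)]
    for j, hj in enumerate(h):
        output[j] += lr * delta_o * hj
    output[-1] += lr * delta_o
    for j, unit in enumerate(hidden):
        for i, xi in enumerate(x):
            unit[i] += lr * delta_h[j] * xi
        unit[-1] += lr * delta_h[j]


def total_error(net, data):
    return sum((t - forward(net, x)[1]) ** 2 for x, t in data)


def train(net, data, lr=0.5, max_epochs=5000, tol=0.01):
    for _ in range(max_epochs):
        for x, t in data:
            train_tuple(net, x, t, lr)
        # Step 4: terminate once the error is very small.
        if total_error(net, data) < tol:
            break
    return net


OR_DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
network = init_network(2, 2, random.Random(0))
print("error before:", total_error(network, OR_DATA))
train(network, OR_DATA)
print("error after:", total_error(network, OR_DATA))
```

Weights are updated after every training tuple here; accumulating the updates over a whole epoch before applying them is an equally common variant.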
How Does a Multi-Layer Neural Network Work?
 The inputs to the network correspond to the attributes measured for
each training tuple
 Inputs are fed simultaneously into the units making up the input layer
 They are then weighted and fed simultaneously to a hidden layer
 The number of hidden layers is arbitrary, although usually only one is used
 The weighted outputs of the last hidden layer are input to units making
up the output layer, which emits the network's prediction
 The network is feed-forward in that none of the weights cycles back to
an input unit or to an output unit of a previous layer
 From a statistical point of view, networks perform nonlinear regression:
Given enough hidden units and enough training samples, they can
closely approximate any function
Defining a Network Topology
 First decide the network topology: # of units in the input
layer, # of hidden layers (if > 1), # of units in each hidden
layer, and # of units in the output layer
 Normalize the input values for each attribute measured in
the training tuples to the range [0.0, 1.0]
 For a discrete-valued attribute, use one input unit per domain value, each initialized to 0
 Output: for classification with more than two classes, one
output unit per class is used
 If a trained network's accuracy is
unacceptable, repeat the training process with a different
network topology or a different set of initial weights
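Two of the input-preparation steps above can be sketched as follows: min-max normalization of a continuous attribute to [0.0, 1.0], and one input unit per domain value for a discrete attribute. The function names and sample values are assumptions for illustration.

```python
def min_max_normalize(values):
    """Rescale a list of numbers linearly so they fall in [0.0, 1.0]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no information; map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_unit_per_value(value, domain):
    """Encode a discrete value with one input unit per domain value.

    Every unit starts at 0; only the unit matching the value is set to 1.
    """
    return [1 if value == d else 0 for d in domain]


print(min_max_normalize([10, 20, 30]))
print(one_unit_per_value("green", ["red", "green", "blue"]))
```

The same one-unit-per-value scheme applies on the output side when there are more than two classes.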
Backpropagation and Interpretability
 Efficiency of backpropagation: Each epoch (one iteration through the
training set) takes O(|D| * w), with |D| tuples and w weights, but the # of
epochs can be exponential in n, the number of inputs, in the worst case
 Rule extraction from networks: network pruning
 Simplify the network structure by removing weighted links that have the
least effect on the trained network
 Then perform link, unit, or activation value clustering
 The set of input and activation values are studied to derive rules
describing the relationship between the input and hidden unit layers
 Sensitivity analysis: assess the impact that a given input variable has on a
network output. The knowledge gained from this analysis can be
represented in rules
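Sensitivity analysis can be approximated by nudging one input and measuring how much the output moves. The sketch below is one simple way to do this, using a finite difference. The function `toy_model` is a hypothetical stand-in for a trained network, and its weights are assumptions chosen so that the second input has no effect.

```python
import math


def sensitivity(model, x, index, delta=1e-3):
    """Estimate how strongly the model output depends on input `index`."""
    bumped = list(x)
    bumped[index] += delta
    # Finite-difference estimate of the output change per unit input change.
    return (model(bumped) - model(x)) / delta


def toy_model(x):
    # Hypothetical trained network: a sigmoid unit that ignores its second input.
    weights = [2.0, 0.0]
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


point = [0.5, 0.5]
for i in range(len(point)):
    print("input", i, "sensitivity", sensitivity(toy_model, point, i))
```

An input with near-zero sensitivity across the data is a candidate for removal, and the signs and sizes of the rest can be summarized as rules.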
Neural Network as a Classifier
 Weaknesses
 Long training time
 Requires a number of parameters that are typically best determined empirically,
e.g., the network topology or “structure.”
 Poor interpretability: Difficult to interpret the symbolic meaning behind
the learned weights and of “hidden units” in the network
 Strengths
 High tolerance to noisy data
 Ability to classify untrained patterns
 Well-suited for continuous-valued inputs and outputs
 Successful on a wide array of real-world data
 Algorithms are inherently parallel
 Techniques have recently been developed for the extraction of rules from
trained neural networks
Artificial Neural Networks (ANN)
X1 X2 X3 Y
1 0 0 0
1 0 1 1
1 1 0 1
1 1 1 1
0 0 1 0
0 1 0 0
0 1 1 1
0 0 0 0
[Figure: input nodes X1, X2, X3 feed a black box through links of weight 0.3 each; the output node, with threshold t = 0.4, produces Y.]
$Y = I(0.3 X_1 + 0.3 X_2 + 0.3 X_3 - 0.4 > 0)$, where $I(z) = 1$ if $z$ is true and $0$ otherwise
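The weights of 0.3 and the threshold t = 0.4 on this slide can be checked against all eight rows of the table. The sketch below does that; the names `indicator`, `black_box`, and `TRUTH_TABLE` are assumptions introduced for the example.

```python
def indicator(condition):
    """I(z): 1 if z is true, 0 otherwise."""
    return 1 if condition else 0


def black_box(x1, x2, x3, weight=0.3, t=0.4):
    # Weighted sum of the three inputs compared against the threshold t.
    return indicator(weight * x1 + weight * x2 + weight * x3 - t > 0)


# Rows of the table: (X1, X2, X3, Y).
TRUTH_TABLE = [
    (1, 0, 0, 0), (1, 0, 1, 1), (1, 1, 0, 1), (1, 1, 1, 1),
    (0, 0, 1, 0), (0, 1, 0, 0), (0, 1, 1, 1), (0, 0, 0, 0),
]

for x1, x2, x3, y in TRUTH_TABLE:
    print(x1, x2, x3, "expected", y, "got", black_box(x1, x2, x3))
```

The output is 1 exactly when at least two of the three inputs are 1, since two weights of 0.3 are the fewest that exceed 0.4.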
Learning Perceptrons
A Multi-Layer Feed-Forward Neural Network
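[Figure: the input vector X enters the input layer, passes through a hidden layer via weights wij, and leaves the output layer as the output vector.]
$w_j^{(k+1)} = w_j^{(k)} + \lambda \left( y_i - \hat{y}_i^{(k)} \right) x_{ij}$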
General Structure of ANN
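[Figure: neuron i receives inputs I1, I2, I3 through weights wi1, wi2, wi3, forms the sum Si, and applies the activation function g(Si) with threshold t to produce the output Oi. A full network has an input layer (x1 ... x5), a hidden layer, and an output layer (y).]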
Training ANN means learning
the weights of the neurons
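For a single perceptron, learning the weights can be sketched with an update of the form w_j <- w_j + lambda * (y - y_hat) * x_j, applied to each training example in turn. The code below is an illustration under assumptions: the function names, the learning rate of 1.0, the zero initial weights, and the epoch cap are choices made for the example, and the training data is the X1, X2, X3, Y table from the black-box slide.

```python
def predict(weights, x):
    """Thresholded output; x[0] is the constant bias input."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0


def train_perceptron(data, learning_rate=1.0, max_epochs=1000):
    n = len(data[0][0])
    weights = [0.0] * n
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in data:
            y_hat = predict(weights, x)
            if y_hat != y:
                mistakes += 1
                # Move each weight in proportion to the error and its input.
                for j in range(n):
                    weights[j] += learning_rate * (y - y_hat) * x[j]
        if mistakes == 0:
            break  # every training example is classified correctly
    return weights


TABLE = [
    (1, 0, 0, 0), (1, 0, 1, 1), (1, 1, 0, 1), (1, 1, 1, 1),
    (0, 0, 1, 0), (0, 1, 0, 0), (0, 1, 1, 1), (0, 0, 0, 0),
]
# Prepend the constant bias input x0 = 1 to every example.
DATA = [([1, x1, x2, x3], y) for x1, x2, x3, y in TABLE]

learned = train_perceptron(DATA)
print(learned, [predict(learned, x) for x, _ in DATA])
```

This loop stops because the table is linearly separable; on data that is not, it would simply run until `max_epochs`.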