2022/9/6
1
Deep Neural Networks
(DNN)
References
• Tan, Steinbach, Karpatne, Kumar, Introduction to Data Mining, 2e, 2019.
• 鄭羽熙, A First Look at Key Technologies of Artificial Intelligence (人工智慧關鍵技術初探), 2019.8.26
• Jinwoo Shin, Deep Learning for Optimization, Communication and Networks, The
East Asian School of Information Theory & Communication (EASITC), August 6-10,
2018
• https://speech.ee.ntu.edu.tw/~hylee/ml/2021-spring.html
2
Outline
• Deep Neural Networks (DNN)-fully connected
• Training DNN
• Recipe for learning
• Ways to prevent overfitting
3
Deep Neural Networks (DNN)
• Also known as, or closely related to, deep learning and deep belief networks
• A DNN has more than one hidden layer between input and output
• The successive layers of a DNN perform feature identification and processing in a series of
stages, much as our brains seem to.
4
• Inspired by biological neuron
• Many different types
• Dendrites (branch-like extensions) can perform complex non-linear computations
• Synapses (neuron-to-neuron junctions) are not a single weight but a complex non-linear
dynamical system
DNN : Biological neuron
5
DNN : Artificial neuron
6
Linear regression ─ fitting a model to
data
7
Linear regression ─ fitting a model to
data
Loss function:
$E = \frac{1}{2}(t - y)^2$
t: target, y: output
8
The loss function
Loss function example for regression:
Mean square error (MSE) between the labeled output t
(training data, correct answers) and the output y
$E = \frac{1}{2}(t - y)^2$
9
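• Loss function example for classification:
$Loss(W, b) = -\frac{1}{K}\sum_{i=1}^{K}\left[ Y^{(i)} \ln \hat{Y}^{(i)} + \left(1 - Y^{(i)}\right) \ln\left(1 - \hat{Y}^{(i)}\right) \right]$
where $Y^{(i)}$ is the label and $\hat{Y}^{(i)}$ the predicted probability for sample $i$ of $K$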
10
Loss function
             | MSE (Mean Square Error) | MAE (Mean Absolute Error)          | CE (Categorical Cross Entropy) | BCE (Binary Cross Entropy)
advantage    | Faster convergence      | Not so sensitive to outliers       | More than two classes          | Two classes
disadvantage | Sensitive to outliers   | Slower convergence (smaller slope) |                                |
applications | Regression problem      | Regression problem                 | Classification problem         | Classification problem
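As a rough illustration of the four losses in the table, the sketch below computes each one with NumPy. The function names, the toy arrays, and the `eps` clipping constant are assumptions made for this example; note also that the slides' $E$ carries a factor 1/2 that the plain mean used here does not.

```python
import numpy as np

def mse(t, y):
    # Mean square error: squares the difference, so outliers weigh heavily
    return np.mean((t - y) ** 2)

def mae(t, y):
    # Mean absolute error: grows linearly, so outliers weigh less
    return np.mean(np.abs(t - y))

def bce(t, y, eps=1e-12):
    # Binary cross entropy for labels in {0, 1} and predicted probabilities y
    y = np.clip(y, eps, 1 - eps)  # avoid log(0)
    return -np.mean(t * np.log(y) + (1 - t) * np.log(1 - y))

def cce(t_onehot, y_prob, eps=1e-12):
    # Categorical cross entropy for one-hot labels and per-class probabilities
    y_prob = np.clip(y_prob, eps, 1.0)
    return -np.mean(np.sum(t_onehot * np.log(y_prob), axis=1))

# Toy targets and predictions (made up for illustration)
t = np.array([0.0, 1.0, 1.0, 0.0])
y = np.array([0.8, 0.9, 0.6, 0.1])
print("MSE:", mse(t, y), "MAE:", mae(t, y), "BCE:", bce(t, y))
```

MSE and MAE would typically be applied to regression outputs, BCE and CE to predicted class probabilities, matching the last row of the table.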
11
Minimizing the loss
12
• Least Mean Square (LMS) also moves in the opposite direction from the gradient, with step
size μ (similar to the learning rate μ on p.27), to approach the minimum
13
Conventional (non-AI) way to minimize
the loss
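A minimal sketch of an LMS update, assuming a linear model and the squared error $\frac{1}{2}e^2$ per sample. The function name `lms_fit`, the toy data, and the values of `mu` and `epochs` are hypothetical choices for illustration.

```python
import numpy as np

def lms_fit(x, t, mu=0.05, epochs=500):
    """Fit weights w so that x @ w approximates t, one sample at a time."""
    w = np.zeros(x.shape[1])
    for _ in range(epochs):
        for xi, ti in zip(x, t):
            e = ti - w @ xi    # error for this sample
            w += mu * e * xi   # step against the gradient of 0.5 * e**2
    return w

# Toy data generated from t = 1 + 2 * x; the first column is a constant bias input
x = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
t = np.array([1.0, 3.0, 5.0, 7.0])
w = lms_fit(x, t)
print(w)  # approaches [1, 2]
```

The step size μ plays the same role here as the learning rate in neural-network training: too large and the updates diverge, too small and convergence is slow.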
It’s a neural network
14
Multiple regression
15
Computing the gradient
(mini-batch p>1, weak law of large numbers,
i.i.d.)
16
Computing the gradient
(Chain rule in Calculus)
o: next layer i: previous layer
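The slide's own equations did not survive extraction, so the following is only a sketch of how the chain rule is commonly applied to one weight. The notation is assumed rather than taken from the slide: $w_{io}$ connects unit $i$ of the previous layer to unit $o$ of the next layer, $z_o = \sum_i w_{io} y_i + b_o$, $y_o = f(z_o)$, and $E = \frac{1}{2}(t - y_o)^2$.

```latex
% Assumed notation: z_o = \sum_i w_{io} y_i + b_o, \quad y_o = f(z_o), \quad E = \tfrac{1}{2}(t - y_o)^2
\frac{\partial E}{\partial w_{io}}
  = \frac{\partial E}{\partial y_o}\,
    \frac{\partial y_o}{\partial z_o}\,
    \frac{\partial z_o}{\partial w_{io}}
  = -(t - y_o)\, f'(z_o)\, y_i
```

Each factor is local to one step of the forward computation, which is what lets the gradient be propagated backwards layer by layer.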
17
• Training data : 2-dimensional data
• Labels : 0 or 1
• Task : Binary classification
Training DNN : A simple example
A dataset
Fields   | class
1.4 2.7  | 0
3.8 3.4  | 1
6.4 2.8  | 1
4.1 0.1  | 0
etc …
[Figure: network with Inputs, Hidden, and Outputs layers, connected by weights 𝑊1 and 𝑊2]
https://www.macs.hw.ac.uk/~dwcorne/Teaching/introdl.ppt
• Initialize with random weights 𝑊1 and 𝑊2
• Input training pattern 1
Inputs: 1.4, 2.7 (class 0)
• Run the network forward to get the output
Inputs: 1.4, 2.7; output: 0.8
• Compare with target output
Output: 0.8; target: 0; error: 0.8
• Adjust weights based on the error signal
• Backpropagation
• Input training pattern 2
Inputs: 3.8, 3.4 (class 1)
• Repeat this process with random training samples
• Each time, make a slight weight adjustment in a direction that reduces the error
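The walk-through above (forward pass, compare with target, adjust weights) can be sketched for training pattern 1 as follows. This is an illustration under stated assumptions: a 2-2-1 network with sigmoid units, a squared-error loss, a learning rate of 0.1, and arbitrary starting weights chosen for this example (so the output differs from the 0.8 shown on the slides).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Arbitrary starting weights (hypothetical values; the slides use random ones)
W1 = np.array([[0.1, -0.2], [0.3, 0.4]])  # inputs -> hidden
b1 = np.zeros(2)
W2 = np.array([0.5, -0.3])                # hidden -> output
b2 = 0.0

def forward(x):
    h = sigmoid(x @ W1 + b1)  # hidden activations
    y = sigmoid(h @ W2 + b2)  # network output
    return h, y

x, t = np.array([1.4, 2.7]), 0.0  # training pattern 1 and its class
lr = 0.1                          # assumed learning rate

h, y = forward(x)
err_before = y - t                # compare with target output

# Backward pass for E = 0.5 * (y - t)**2, applying the chain rule
delta_o = (y - t) * y * (1 - y)        # error signal at the output unit
delta_h = delta_o * W2 * h * (1 - h)   # error signal at the hidden units

# Slight adjustment of every weight against its gradient
W2 -= lr * delta_o * h
b2 -= lr * delta_o
W1 -= lr * np.outer(x, delta_h)
b1 -= lr * delta_h

_, y_after = forward(x)
err_after = y_after - t
print("error before:", err_before, "error after:", err_after)
```

Repeating the same update over randomly chosen training patterns is what gradually lowers the error on the whole dataset.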
Backpropagation
/Testing
26
Summary (μ: learning rate like LMS step
size)
27
Result from p44
1. Draw a batch of training samples x and corresponding targets y
2. Run the network on x to obtain predictions y_pred
3. Compute the loss of the network on the batch (e.g. mean square error), a
measure of the mismatch between y_pred and y
4. Compute the gradient of the loss with regard to
the network’s parameters (a backward pass).
5. Move the parameters a little in the opposite
direction from the gradient, i.e. add -step * gradient to them
Gradient Descent
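The five steps can be sketched on a one-variable linear model. Everything specific here is an assumption for illustration: the synthetic data, the batch size of 16, the step of 0.1, and the number of iterations.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=200)  # synthetic inputs
Y = 3.0 * X + 0.5                 # synthetic targets (hypothetical ground truth)

w, b, step = 0.0, 0.0, 0.1
for _ in range(500):
    idx = rng.choice(len(X), size=16, replace=False)  # 1. draw a batch
    xb, yb = X[idx], Y[idx]
    y_pred = w * xb + b                               # 2. run the model on the batch
    loss = np.mean((y_pred - yb) ** 2)                # 3. mean square error
    grad_w = np.mean(2 * (y_pred - yb) * xb)          # 4. gradient of the loss
    grad_b = np.mean(2 * (y_pred - yb))
    w -= step * grad_w                                # 5. move against the gradient
    b -= step * grad_b

print(w, b, loss)  # w and b approach 3.0 and 0.5
```

In a neural network, step 4 is carried out by the backward pass instead of the two hand-written gradient lines.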
28
• A sample is a single row of data, like a (question, standard answer) pair
• Batch size: number of samples used for one iteration of gradient descent
  - Batch size = 1: stochastic gradient descent
  - 1 < Batch size < all: mini-batch gradient descent
  - Batch size = all: batch gradient descent
• Epoch
  - Number of times that the learning algorithm works through all training
samples (how many times the same set of questions is worked through)
Sample, Batch size & Epochs
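To make the relation between samples, batch size, and epochs concrete, the sketch below counts the weight updates in one epoch. The helper names are hypothetical; 1309 and 16 are the training-set size and batch size quoted for the Titanic example elsewhere in these slides.

```python
import math

def iterations_per_epoch(n_samples, batch_size):
    # One iteration = one gradient-descent update on one batch
    return math.ceil(n_samples / batch_size)

def batches(samples, batch_size):
    # Yield consecutive batches; the last one may be smaller
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

n = 1309
for batch_size in (1, 16, n):
    print(batch_size, iterations_per_epoch(n, batch_size))
# batch size 1   -> 1309 updates per epoch (stochastic gradient descent)
# batch size 16  -> 82 updates per epoch (mini-batch gradient descent)
# batch size all -> 1 update per epoch (batch gradient descent)
```

Training for 10 epochs with batch size 16 would therefore perform 820 updates.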
29
The learning rate
30
overfitting
31
32
Underfitting (falling short: a degree-1 fit to a degree-2 curve), overfitting
(going too far: a degree-4 fit to a degree-2 curve), balanced (the middle way)
https://towardsdatascience.com/8-simple-techniques-to-prevent-overfitting-4d443da2ef7d
Preventing overfitting ─
early stopping
85% training
10% validation
5% test
33
Preventing Overfitting-Early Stopping
Source: https://www.analyticsvidhya.com/blog/2020/02/underfitting-overfitting-best-fitting-machine-learning/
https://cs231n.github.io/neural-networks-3/#ada
34
Example: at each epoch, the same 16000 training samples are used
with backpropagation, and the
same 4000 validation samples (no overlap with the training data) are evaluated
without backpropagation
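A minimal sketch of an early-stopping rule, assuming the validation loss is recorded once per epoch and training stops after `patience` epochs without improvement. The function name, the `patience` value, and the loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (best_epoch, best_loss), stopping once the validation loss
    has not improved for `patience` consecutive epochs."""
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch  # new best: keep these weights
        elif epoch - best_epoch >= patience:
            break                                # validation loss keeps rising
    return best_epoch, best_loss

# Made-up validation losses: falling at first, then rising as overfitting sets in
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.70]
print(early_stopping_epoch(val_losses))  # (3, 0.5)
```

In practice the weights saved at the best epoch are the ones kept, and the test set is used only once at the very end.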
• Gradient Descent (GD): fixed learning rate; can get stuck in local minima
• Adam Optimizer: Adaptive Moment Estimation (Adam) keeps a separate learning rate
for each weight as well as an exponentially decaying average of previous gradients. It
is reputed to work well for both sparse matrices and noisy data.
Optimizers
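The sketch below shows one common way to write the Adam update in NumPy. The default values of `lr`, `beta1`, `beta2`, and `eps`, as well as the quadratic test function, are assumptions for this example rather than settings from the slides.

```python
import numpy as np

def adam_minimize(grad_fn, theta, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=2000):
    m = np.zeros_like(theta)  # decaying average of past gradients
    v = np.zeros_like(theta)  # decaying average of past squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        # Per-weight step size: larger where past gradients were small
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta

# Toy loss L = (theta0 - 3)^2 + 10 * (theta1 + 1)^2 with very different slopes
grad = lambda th: np.array([2 * (th[0] - 3.0), 20 * (th[1] + 1.0)])
theta = adam_minimize(grad, np.array([0.0, 0.0]))
print(theta)  # close to [3, -1]
```

Dividing by the running root-mean-square of the gradient is what gives each weight its own effective learning rate.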
35
Modeling one neuron
36
• ReLU: mitigates the vanishing gradient during backpropagation (multiplying factors
smaller than 1 gives a product much smaller than 1); used for hidden layers
[Xavier Glorot, AISTATS’11]. In 2006, people used RBM pre-training.
In 2015, people use ReLU.
• Tanh: output range -1 to 1
• Sigmoid: output range 0 to 1; the outputs need not sum to 1, so it suits multi-label
classification (e.g. NOMA, one subcarrier shared by multiple users)
Activation functions
37
Activation functions
• softmax (normalized exponential function): sum=1, one-label classification
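The activation functions named on these slides can be sketched in a few lines of NumPy. The input vector `z` is an arbitrary example; subtracting the maximum inside `softmax` is a common numerical-stability choice, not something the slides specify.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))  # shift by the max for numerical stability
    return e / e.sum()

z = np.array([2.0, -1.0, 0.5])  # arbitrary pre-activation values
print("ReLU:   ", relu(z))
print("Tanh:   ", np.tanh(z))
print("Sigmoid:", sigmoid(z), "sum =", sigmoid(z).sum())  # sum is not 1
print("Softmax:", softmax(z), "sum =", softmax(z).sum())  # sum is 1
```

Because each sigmoid output is independent of the others, several labels can be active at once, while softmax forces the classes to compete for a single label.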
38
Feedforward Network
• Feedforward (forward propagation): once we
have found, among many families of functions,
a model that discriminates a specific pattern,
we can use the parameters learned by this
model to predict (the loss function
determines the prediction target, which
may be classification, regression, or a
combination of the two).
• In a feedforward network, information
moves in only one direction: starting from
the input layer, forward through the hidden
layers, and then to the output layer.
39
Hidden layer
• Hidden layer: between the input layer
and output layer. Could have more than
one hidden layer (“deep” neural
network)
• Hyperparameters: how many layers,
how many nodes in each layer, how to
connect nodes, which activation
function to use
40
Estimating the Surviving Chance of
Titanic Passengers: DNN example
• Training data: 1309
• http://tflearn.org/tutorials/quickstart.html
41
• On April 15, 1912, the Titanic sank after colliding with an iceberg, killing 1502 out
of 2224 passengers and crew. Although there was some element of luck involved in
surviving the sinking, some groups of people were more likely to survive than others,
such as women, children, and the upper class.
Estimating the Surviving Chance of
Titanic Passengers
Data     | Description
survived | Survived (0 = No, 1 = Yes)
pclass   | Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd)
name     | Name
Sex      | Sex
Age      | Age
Sibsp    | Number of Siblings/Spouses Aboard
Parch    | Number of Parents/Children Aboard
Ticket   | Ticket Number
Fare     | Passenger Fare
42
Survived | Pclass | Name                           | Sex    | Age  | Sibsp | Parch | Ticket   | Fare
1        | 1      | Aubart, Mme. Leontine Pauline  | Female | 24   | 0     | 0     | PC 17477 | 69.3
0        | 2      | Bowenur, Mr. Solomon           | Male   | 42   | 0     | 0     | 211535   | 13
1        | 3      | Baclini, Miss Marie Catherine  | Female | 5    | 2     | 1     | 2666     | 19.2583
0        | 3      | Youseff, Mr. Gerious           | Male   | 45.5 | 0     | 0     | 2628     | 7.225
Estimating the Surviving Chance of
Titanic Passengers
43
……
# Preprocess data
data = preprocess(data, to_ignore)
# Build neural network
net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 2, activation='softmax')
# Attach the loss and optimizer so the network can be trained
net = tflearn.regression(net)
# Define model
model = tflearn.DNN(net)
# Start training (apply gradient descent algorithm)
model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)
Estimating the Surviving Chance of
Titanic Passengers
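The code above calls `preprocess(data, to_ignore)` without showing it. A minimal sketch of what such a helper might look like follows, assuming that `to_ignore = [1, 6]` drops the name and ticket columns and that the sex field is encoded as 1.0 for female and 0.0 for male, which leaves the 6 numeric features expected by `input_data(shape=[None, 6])`. The column indices and the encoding are assumptions, not taken from the slides.

```python
import numpy as np

def preprocess(passengers, columns_to_ignore):
    rows = [list(p) for p in passengers]  # copy so the input is not modified
    # Remove ignored columns, highest index first so positions stay valid
    for col in sorted(columns_to_ignore, reverse=True):
        for row in rows:
            row.pop(col)
    # After removing the name column, the sex field sits at index 1
    for row in rows:
        row[1] = 1.0 if row[1] == 'female' else 0.0
    return np.array(rows, dtype=np.float32)

to_ignore = [1, 6]  # assumed: name and ticket columns
sample = [[3, 'Jack Dawson', 'male', 19, 0, 0, 'N/A', 5.0]]
features = preprocess(sample, to_ignore)
print(features)  # one row of 6 numeric features
```

Free-text fields such as the name and ticket number are dropped because a fully connected network needs fixed-length numeric inputs.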
44
# Let's create some data for DiCaprio and Winslet
dicaprio = [3, 'Jack Dawson', 'male', 19, 0, 0, 'N/A', 5.0000]
winslet = [1, 'Rose DeWitt Bukater', 'female', 17, 1, 2, 'N/A', 100.0000]
# Preprocess data
dicaprio, winslet = preprocess([dicaprio, winslet], to_ignore)
# Predict surviving chances (class 1 results)
pred = model.predict([dicaprio, winslet])
print('DiCaprio Surviving Rate:', pred[0][1])
print('Winslet Surviving Rate:', pred[1][1])
Estimating the Surviving Chance of
Titanic Passengers
45
• Output:
• DiCaprio Surviving Rate: 0.13849584758251708
• Winslet Surviving Rate: 0.92201167345047
Estimating the Surviving Chance of
Titanic Passengers
46
Recipe for Learning
[Figure labels: overfitting (Don’t forget!); Preventing Overfitting; Modify the Network; Better optimization Strategy]
http://www.gizmodo.com.au/2015/04/the-basic-recipe-for-machine-learningexplained-in-a-single-powerpoint-slide/
47
Recipe for Learning
• Modify the network, e.g. use the ReLU activation function, add more hidden layers, or
add more nodes in each layer (against underfitting)
• Use a better optimization strategy, such as Adam [Diederik P. Kingma, ICLR’15]
• Ways to prevent overfitting:
1. Early stopping (earlier page)
2. Remove layers or reduce the size of the model: an over-complex model is
more likely to overfit.
3. Dropout (next pages)
48
Dropout
• Each time before computing the gradients, each neuron has a p% chance of being dropped out
• The structure of the network is changed; the new, thinner network is used for training
• For each mini-batch, we resample the dropped-out neurons.
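A minimal sketch of dropout applied to one layer's activations, with a fresh mask drawn for each mini-batch. Here `p` is a fraction rather than a percentage, and the surviving activations are rescaled by 1/(1 - p) ("inverted dropout"); that rescaling is a common convention assumed for this example, not something stated on the slides.

```python
import numpy as np

def dropout(activations, p, rng):
    """Drop each neuron (column) with probability p for the whole mini-batch."""
    n_neurons = activations.shape[-1]
    keep = rng.random(n_neurons) >= p  # True for neurons that survive
    # Rescale survivors so the expected activation is unchanged (assumption)
    return activations * keep / (1.0 - p), keep

rng = np.random.default_rng(0)
h = np.ones((4, 8))  # mini-batch of 4 samples, 8 hidden neurons
for batch in range(2):
    out, mask = dropout(h, p=0.5, rng=rng)  # mask is resampled per mini-batch
    print("batch", batch, "kept neurons:", int(mask.sum()))
```

At test time no neurons are dropped; with the rescaling used here, the activations can then be used unchanged.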
49
Dropout
[Figure: during training, dropout leaves a thinner network]
50
• CNN, LSTM, GRU, etc. are just different ways to connect neurons.
Different Network Structures
51
General Guide (more detailed)
• Loss on training data is large:
  - model bias: make your model more complex
  - optimization: use a better optimizer (Adam?)
• Loss on training data is small, loss on testing data is large:
  - overfitting: more training data, data augmentation, or make your model simpler
  - mismatch: training / testing data distributions are mismatched
• Trade-off between a complex and a simple model: split your training data into a
training set and a validation set for model selection
Source: https://speech.ee.ntu.edu.tw/~hylee/ml/ml2021-course-data/overfit-v6.pptx
mismatch
• Training / testing data distributions are mismatched
• Could use a Bayesian Neural Network (BNN) to detect mismatch (anomaly) and issue an alarm / re-train
[Zhu17] L. Zhu and N. Laptev, “Deep and confident prediction for time series at Uber,” in Proc.
IEEE International Conference on Data Mining Workshops (ICDMW), 2017, pp. 103–110.
53
Model Bias
• The model is too simple.
$y = b + w x_1$
More features: $y = b + \sum_{j} w_j x_j$
Deep Learning (more neurons, layers): $y = b + \sum_{i} c_i \,\mathrm{sigmoid}\left(b_i + \sum_{j} w_{ij} x_j\right)$
[Figure labels: small loss]
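Optimization Issue
• A large loss does not always imply model bias. There is another possibility …
[Figure: loss $L$ over the parameters $\boldsymbol{\theta}$; the loss $L(\boldsymbol{\theta}^*)$ at the solution $\boldsymbol{\theta}^*$ that was found is large]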
Optimization Issue
• Gain insight from comparison
• Start from shallower networks (or other models), which are
easier to optimize.
• If deeper networks do not obtain a smaller loss on training data,
then there is an optimization issue.
• Solution: a more powerful optimization technique (the last pages
add momentum)
Ref: http://arxiv.org/abs/1512.03385
            | 1 layer | 2 layer | 3 layer | 4 layer | 5 layer
2017 – 2020 | 0.28k   | 0.18k   | 0.14k   | 0.10k   | 0.34k
Overfitting
• Small loss on training data, large loss on testing data. Why?
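An extreme example
Training data: $\{(\boldsymbol{x}^1, y^1), (\boldsymbol{x}^2, y^2), \dots, (\boldsymbol{x}^N, y^N)\}$
$f(\boldsymbol{x}) = \begin{cases} y^i & \exists\, \boldsymbol{x}^i = \boldsymbol{x} \\ \text{random} & \text{otherwise} \end{cases}$
This function obtains zero training loss, but large
testing loss. Less than useless …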
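The extreme example can be written out as a lookup table that memorizes the training set and answers at random everywhere else. The helper name `make_memorizer` and the fixed random seed are assumptions for illustration; the four training pairs reuse the small dataset from the earlier training example.

```python
import random

def make_memorizer(train_pairs, seed=0):
    table = {tuple(x): y for x, y in train_pairs}  # memorize every training input
    rng = random.Random(seed)
    def f(x):
        key = tuple(x)
        if key in table:
            return table[key]  # seen during training: return the stored label
        return rng.random()    # otherwise: a random output
    return f

train = [((1.4, 2.7), 0), ((3.8, 3.4), 1), ((6.4, 2.8), 1), ((4.1, 0.1), 0)]
f = make_memorizer(train)

train_errors = sum(f(x) != y for x, y in train)
print("training errors:", train_errors)      # 0
print("unseen input:", f((5.0, 5.0)))        # a random number, unrelated to any label
```

Zero training loss therefore says nothing by itself about how the function behaves on data it has not seen.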
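Overfitting (does not generalize well)
[Figure: three x–y plots showing the real data distribution (not observable), the training data, and the testing data. A flexible model passes through the training points but goes “freestyle” between them, giving a large loss on the testing data.]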
Overfitting
[Figure: x–y plots of a flexible model]
• More training data
• Data augmentation
N-fold Cross Validation to select model
(N=3) efficiently gives you “more” data for validation
Training Set
Train Train Val
Train Val Train
Val Train Train
Model 1 | Model 2 | Model 3
[Figure: per-fold mse values for the three models, in extraction order: 0.4, 0.5, 0.3, 0.4, 0.5, 0.6, 0.2, 0.4, 0.3; Avg mse = 0.4, 0.5, 0.3]
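A minimal sketch of model selection with N-fold cross validation: split the sample indices into folds, evaluate every model on each held-out fold, and keep the model with the lowest average validation mse. The helper names and the per-fold mse values below are hypothetical and are not the numbers from the slide.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, val_indices) pairs covering all samples."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, val))
        start += size
    return folds

def select_model(fold_mse):
    """Pick the model whose average validation mse over the folds is lowest."""
    avg = {name: sum(values) / len(values) for name, values in fold_mse.items()}
    return min(avg, key=avg.get), avg

folds = k_fold_indices(9, 3)
for train_idx, val_idx in folds:
    print("train:", train_idx, "val:", val_idx)

# Hypothetical validation mse of three models on the three folds
fold_mse = {
    "model_1": [0.5, 0.4, 0.6],
    "model_2": [0.2, 0.3, 0.1],
    "model_3": [0.4, 0.4, 0.4],
}
best, avg = select_model(fold_mse)
print(best, avg)
```

Every sample serves as validation data exactly once, so the average is less sensitive to one lucky or unlucky split than a single validation set would be.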
Optimization Fails because ……
• The training loss stops decreasing over updates while it is not yet small enough:
the gradient is close to zero at a critical point
• Which one? Local minima (no way to go) or saddle point (escape)
Small Gradient …
[Figure: loss versus the value of a network parameter w under gradient descent]
• Very slow at the plateau ($\partial L / \partial w \approx 0$)
• Stuck at saddle point ($\partial L / \partial w = 0$)
• Stuck at local minima ($\partial L / \partial w = 0$)
Small Batch v.s. Large Batch
                                     | Small  | Large
Speed for one update (no parallel)   | Faster | Slower
Speed for one update (with parallel) | Same   | Same (not too large)
Time for one epoch                   | Slower | Faster
Gradient                             | Noisy  | Stable (weak law of large numbers)
Optimization                         | Better | Worse
Generalization                       | Better | Worse
Batch size is a hyperparameter you have to decide.
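Gradient Descent + Momentum (p = mv, high-school physics)
Movement is based not just on the gradient but also on the previous movement:
movement of the last step minus the gradient at present.
Starting at $\boldsymbol{\theta}^0$, movement $\boldsymbol{m}^0 = \boldsymbol{0}$
Compute gradient $\boldsymbol{g}^0$
Movement $\boldsymbol{m}^1 = \lambda \boldsymbol{m}^0 - \eta \boldsymbol{g}^0$
Move to $\boldsymbol{\theta}^1 = \boldsymbol{\theta}^0 + \boldsymbol{m}^1$
Compute gradient $\boldsymbol{g}^1$
Movement $\boldsymbol{m}^2 = \lambda \boldsymbol{m}^1 - \eta \boldsymbol{g}^1$
Move to $\boldsymbol{\theta}^2 = \boldsymbol{\theta}^1 + \boldsymbol{m}^2$
[Figure: parameters $\boldsymbol{\theta}^0 \dots \boldsymbol{\theta}^3$ with gradients $\boldsymbol{g}^0 \dots \boldsymbol{g}^3$ and movements $\boldsymbol{m}^1 \dots \boldsymbol{m}^3$; legend: movement, gradient, movement of the last step]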
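Gradient Descent + Momentum
Movement = Negative of $\partial L / \partial w$ + Last Movement
[Figure: loss curve marking the negative of $\partial L / \partial w$, the last movement, and the real movement, including a point where $\partial L / \partial w = 0$]
Momentum helps to get out
of local minima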
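The momentum update can be sketched in a few lines, using the rule movement = λ × last movement − η × gradient and then moving the parameter by that movement. The quadratic test loss and the values of `eta`, `lam`, and `steps` are assumptions chosen for illustration.

```python
def gd_momentum(grad_fn, theta0, eta=0.1, lam=0.9, steps=300):
    theta, m = theta0, 0.0       # the initial movement is zero
    for _ in range(steps):
        g = grad_fn(theta)       # gradient at the current position
        m = lam * m - eta * g    # last movement minus the present gradient
        theta = theta + m        # move by the combined movement
    return theta

# Toy loss L(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3)
theta = gd_momentum(lambda th: 2 * (th - 3.0), theta0=0.0)
print(theta)  # close to 3
```

Because the last movement is carried over, the parameter keeps moving for a while even where the gradient is zero, which is how momentum can roll through a plateau or a shallow local minimum.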

Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 

AIML2 DNN 3.5hr (111-1).pdf

  • 1. 2022/9/6 1 Deep Neural Networks (DNN) References • Tan, Steinbach, Karpatne, Kumar, Introduction to Data Mining, 2e, 2019. • 鄭羽熙,人工智慧關鍵技術初探,2019.8.26 • Jinwoo Shin, Deep Learning for Optimization, Communication and Networks, The East Asian School of Information Theory & Communication (EASITC), August 6-10, 2018 • https://speech.ee.ntu.edu.tw/~hylee/ml/2021-spring.html 2
  • 2. 2022/9/6 Outline • Deep Neural Networks (DNN)-fully connected • Training DNN • Recipe for learning • Ways to prevent overfitting 3 Deep Neural Networks (DNN) • aka or related to Deep Learning, Deep Belief Networks • DNN has more than 1 (hidden) layer between input and output • The series of layers of DNN do feature identification and processing in a series of stages, just as our brains seem to. 4
  • 3. 2022/9/6 DNN : Biological neuron • Inspired by the biological neuron • Many different types • Dendrites (branch-like extensions) can perform complex non-linear computations • Synapses (neuron-to-neuron junctions) are not a single weight but a complex non-linear dynamical system 5 DNN : Artificial neuron 6
  • 4. 2022/9/6 Linear regression ─ fitting a model to data 7 Linear regression ─ fitting a model to data Loss function: $E = \frac{1}{2}(t - y)^2$, t: target 8
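  • 5. 2022/9/6 The loss function Loss function example for regression: Mean square error (MSE) between labeled output t (training data, correct answers) and output y: $E = \frac{1}{2}(t - y)^2$ 9 • Loss function example for classification: $\mathrm{Loss}(W, b) = -\frac{1}{K}\sum_{i=1}^{K}\left[Y^{(i)} \ln \hat{Y}^{(i)} + \left(1 - Y^{(i)}\right) \ln\left(1 - \hat{Y}^{(i)}\right)\right]$, with $Y^{(i)}$ the label and $\hat{Y}^{(i)}$ the predicted probability for sample i 10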
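The two losses on this page can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `mse_loss` and `bce_loss` and the clipping constant `eps` are hypothetical and do not come from the slides, and the classification loss is read as the standard binary cross entropy averaged over K samples.

```python
import math

def mse_loss(t, y):
    """Squared error E = 1/2 * (t - y)^2 for one target t and one output y."""
    return 0.5 * (t - y) ** 2

def bce_loss(labels, probs, eps=1e-12):
    """Binary cross entropy averaged over the K samples in `labels`.

    `eps` (an assumption, not on the slide) clips probabilities so that
    log(0) is never evaluated.
    """
    k = len(labels)
    total = 0.0
    for y_true, y_prob in zip(labels, probs):
        p = min(max(y_prob, eps), 1.0 - eps)
        total += y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p)
    return -total / k

# Output 0.8 against target 0, as in the simple training example of the deck.
example_mse = mse_loss(0, 0.8)
# Confident correct predictions give a smaller loss than confident wrong ones.
good_bce = bce_loss([0, 1], [0.1, 0.9])
bad_bce = bce_loss([0, 1], [0.9, 0.1])
```

Because the cross entropy grows quickly as a confident prediction turns out wrong, `bad_bce` is much larger than `good_bce`.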
  • 6. 2022/9/6 Loss function • MSE (Mean Square Error): advantage: faster convergence; disadvantage: sensitive to outliers; application: regression problem • MAE (Mean Absolute Error): advantage: not so sensitive to outliers; disadvantage: slower convergence (smaller slope); application: regression problem • CE (Categorical Cross Entropy): more than two classes; application: classification problem • BCE (Binary Cross Entropy): two classes; application: classification problem 11 Minimizing the loss 12
  • 7. 2022/9/6 • Least Mean Square (LMS) also moves in the opposite direction from the gradient, with step size μ (similar to the learning rate μ in AI on p.27), to approach the minimum 13 Conventional (non-AI) way to minimize the loss It’s a neural network 14
  • 8. 2022/9/6 Multiple regression 15 Computing the gradient (mini-batch p>1, weak law of large numbers, i.i.d.) 16
  • 9. 2022/9/6 Computing the gradient (Chain rule in Calculus) o: next layer i: previous layer 17 Training DNN : A simple example • Training data : 2-dimensional data • Labels : 0 or 1 • Task : Binary classification • A dataset (fields → class): (1.4, 2.7) → 0; (3.8, 3.4) → 1; (6.4, 2.8) → 1; (4.1, 0.1) → 0; etc. … • Network: Inputs → Hidden → Outputs, with weights W1 and W2 https://www.macs.hw.ac.uk/~dwcorne/Teaching/introdl.ppt
  • 10. 2022/9/6 Training DNN : A simple example • Initialize with random weights W1 and W2 • Input training pattern 1: inputs (1.4, 2.7) https://www.macs.hw.ac.uk/~dwcorne/Teaching/introdl.ppt
  • 11. 2022/9/6 Training DNN : A simple example • Run the neural network forward to get the output: inputs (1.4, 2.7) give output 0.8 • Compare with target output: output 0.8, target 0, error 0.8 https://www.macs.hw.ac.uk/~dwcorne/Teaching/introdl.ppt
  • 12. 2022/9/6 Training DNN : A simple example • Adjust weights based on the error signal (back propagation) • Input training pattern 2: inputs (3.8, 3.4) https://www.macs.hw.ac.uk/~dwcorne/Teaching/introdl.ppt
  • 13. 2022/9/6 Training DNN : A simple example • Input training pattern 2: inputs (3.8, 3.4) • Repeat this process with random training samples • Making a slight weight adjustment in a direction to reduce the error https://www.macs.hw.ac.uk/~dwcorne/Teaching/introdl.ppt Backpropagation /Testing 26
  • 14. 2022/9/6 Summary (μ: learning rate like LMS step size) 27 Result from p44 1. Draw a batch of training samples x and corresponding targets y 2. Run the network on x to obtain predictions y_pred 3. Compute the loss of the network on the batch (e.g. mean square error), a measure of the mismatch between y_pred and y 4. Compute the gradient of the loss with regard to the network’s parameters (a backward pass). 5. Move the parameters a little in the opposite direction from the gradient: - step * gradient Gradient Descent 28
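The five gradient descent steps listed above can be followed line by line in a small example. The sketch below fits a one-input linear model y = w·x + b with plain Python; the function name `train_linear`, the toy data, the step size and the iteration count are all illustrative assumptions rather than anything taken from the slides.

```python
import random

def train_linear(samples, step=0.05, batch_size=2, n_iter=5000, seed=0):
    """Mini-batch gradient descent for y = w * x + b with squared-error loss."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(n_iter):
        # 1. Draw a batch of training samples x and corresponding targets t.
        batch = rng.sample(samples, batch_size)
        # 2. Run the model on x to obtain predictions.
        preds = [w * x + b for x, _ in batch]
        # 3. Compute the loss on the batch: mean of 1/2 * (t - y)^2.
        loss = sum(0.5 * (t - p) ** 2 for (_, t), p in zip(batch, preds)) / batch_size
        # 4. Compute the gradient of the loss with regard to w and b.
        grad_w = sum((p - t) * x for (x, t), p in zip(batch, preds)) / batch_size
        grad_b = sum(p - t for (_, t), p in zip(batch, preds)) / batch_size
        # 5. Move the parameters a little in the opposite direction from the gradient.
        w -= step * grad_w
        b -= step * grad_b
    return w, b

# Toy data generated from t = 2x + 1 (hypothetical, noise-free).
data = [(x, 2.0 * x + 1.0) for x in (0.0, 1.0, 2.0, 3.0)]
w_fit, b_fit = train_linear(data)
```

With noise-free data the fitted parameters approach w = 2 and b = 1; in a deep network the same loop applies, with step 4 carried out by backpropagation.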
  • 15. 2022/9/6 Sample, Batch size & Epochs • A sample is a single row of data, like a (question, standard answer) pair • Batch size: number of samples used for one iteration of gradient descent: batch size = 1 is stochastic gradient descent; 1 < batch size < all is mini-batch gradient descent; batch size = all is batch gradient descent • Epoch: number of times that the learning algorithm works through all training samples (how many times the same set of problems is practiced) 29 The learning rate 30
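To make the relation between samples, batch size and epochs concrete, the following sketch splits a list of samples into batches and counts the resulting gradient descent updates. The helper names `iterate_minibatches` and `count_updates` and the ten-sample dataset are hypothetical.

```python
import math

def iterate_minibatches(samples, batch_size):
    """Yield consecutive batches; the last batch may be smaller."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

def count_updates(n_samples, batch_size, n_epochs):
    """Number of gradient descent iterations over n_epochs passes of the data."""
    return math.ceil(n_samples / batch_size) * n_epochs

samples = list(range(10))
batches = list(iterate_minibatches(samples, batch_size=4))

sgd_updates = count_updates(10, batch_size=1, n_epochs=1)         # stochastic
minibatch_updates = count_updates(10, batch_size=4, n_epochs=3)   # mini-batch
full_batch_updates = count_updates(10, batch_size=10, n_epochs=1) # batch
```

One epoch always covers every sample once; only the number of parameter updates inside the epoch changes with the batch size.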
  • 17. 2022/9/6 Preventing overfitting ─ early stopping 85% training 10% validation 5% test 33 Preventing Overfitting-Early Stopping Source: https://www.analyticsvidhya.com/blog/2020/02/underfitting-overfitting-best-fitting-machine-learning/ https://cs231n.github.io/neural-networks-3/#ada 34 Example: at each epoch, same 16000 training data with backpropagation and same 4000 validation data (no overlap with training data) without backpropagation
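Early stopping can be sketched as a rule applied to the validation loss recorded after each epoch: keep the best epoch seen so far and stop once the loss has failed to improve for a few epochs in a row. The function name `early_stopping_epoch`, the `patience` parameter and the loss values below are illustrative assumptions, not taken from the slides.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (best_epoch, best_loss), stopping after `patience` epochs without improvement."""
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Validation loss improved: remember this epoch and reset the counter.
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation loss keeps rising: stop training here
    return best_epoch, best_loss

# Hypothetical validation losses: they fall, then rise as overfitting sets in.
history = [0.9, 0.7, 0.6, 0.65, 0.7, 0.5]
best_epoch, best_loss = early_stopping_epoch(history)
```

Training stops before the last value is ever seen, which is the price of stopping early; in practice the weights saved at `best_epoch` are the ones kept.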
  • 18. 2022/9/6 Optimizers • Gradient Descent (GD): fixed learning rate; can get stuck in local minima • Adam Optimizer: Adaptive Moment Estimation (Adam) keeps a separate learning rate for each weight as well as an exponentially decaying average of previous gradients. It is reputed to work well for both sparse matrices and noisy data. 35 Modeling one neuron 36
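A minimal sketch of the Adam update in plain Python may help show what "a separate learning rate for each weight" and "an exponentially decaying average of previous gradients" mean in practice. The hyperparameter defaults follow the values commonly quoted for Adam; the function name `adam_minimize`, the learning rate of 0.1 and the quadratic test function are assumptions made for illustration.

```python
import math

def adam_minimize(grad_fn, w0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, n_steps=500):
    """Minimize a function given its gradient, using the Adam update rule."""
    w = list(w0)
    m = [0.0] * len(w)  # decaying average of past gradients (first moment)
    v = [0.0] * len(w)  # decaying average of past squared gradients (second moment)
    for t in range(1, n_steps + 1):
        g = grad_fn(w)
        for i in range(len(w)):
            m[i] = beta1 * m[i] + (1 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
            m_hat = m[i] / (1 - beta1 ** t)  # bias correction for the zero start
            v_hat = v[i] / (1 - beta2 ** t)
            # Each weight is scaled by its own sqrt(v_hat): a per-weight step size.
            w[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Hypothetical loss (w0 - 3)^2 + 10 * (w1 + 1)^2 with very different curvatures.
def grad(w):
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]

w_adam = adam_minimize(grad, [0.0, 0.0])
```

Both coordinates move at a similar pace at the start even though their gradients differ by an order of magnitude, which is the effect of the per-weight scaling.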
  • 19. 2022/9/6 Activation functions • ReLU: mitigates vanishing gradients during back propagation (a product of factors smaller than 1 is much smaller than 1), used for hidden layers [Xavier Glorot, AISTATS’11]. In 2006, people used RBM pre-training. In 2015, people use ReLU. • Tanh: output range -1 to 1 • Sigmoid: output range 0 to 1; the outputs need not sum to 1, so it suits multi-label classification (e.g. NOMA, one subcarrier with multiple users) 37 Activation functions • softmax (normalized exponential function): outputs sum to 1, used for single-label classification 38
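The four activation functions named on these two pages can be written directly with the standard library. This is a small illustrative sketch; the input scores are made up, and the max-subtraction inside `softmax` is a common numerical-stability trick rather than something shown on the slides.

```python
import math

def relu(x):
    """max(0, x): gradient is 1 for positive inputs, so it does not shrink."""
    return max(0.0, x)

def tanh(x):
    """Output in (-1, 1)."""
    return math.tanh(x)

def sigmoid(x):
    """Output in (0, 1); applied independently to each output unit."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    """Normalized exponential: outputs are positive and sum to 1."""
    shift = max(xs)  # subtracting the max avoids overflow in exp
    exps = [math.exp(x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, -1.0]
multi_label = [sigmoid(s) for s in scores]  # each in (0, 1), sum is unconstrained
single_label = softmax(scores)              # one probability distribution
```

The sigmoid outputs here sum to more than 1, which is why sigmoid fits multi-label tasks while softmax fits tasks with exactly one label.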
  • 20. 2022/9/6 Feedforward Network • Feedforward (forward propagation): once we have found, among many candidate functions, a model that discriminates a specific pattern, we can use the parameters learned by this model to predict (the loss function determines the prediction target, which may be classification, regression, or a combination of the two). • In a feedforward network, information only moves in one direction: starting from the input layer, then through the hidden layer, and then to the output layer. 39 Hidden layer • Hidden layer: between the input layer and the output layer. There can be more than one hidden layer (“deep” neural network) • Hyperparameters: how many layers, how many nodes at each layer, how to connect nodes, what the activation function is 40
  • 21. 2022/9/6 Estimating the Surviving Chance of Titanic Passengers: DNN example • Training data: 1309 • http://tflearn.org/tutorialsquickstart.html 41 Estimating the Surviving Chance of Titanic Passengers • On April 15, 1912, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper class. • Data description: survived: Survived (0 = No, 1 = Yes); pclass: Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd); name: Name; sex: Sex; age: Age; sibsp: Number of Siblings/Spouses Aboard; parch: Number of Parents/Children Aboard; ticket: Ticket Number; fare: Passenger Fare 42
  • 22. 2022/9/6 Estimating the Surviving Chance of Titanic Passengers 43 Sample rows:
    Survived | Pclass | Name | Sex | Age | Sibsp | Parch | Ticket | Fare
    1 | 1 | Aubart, Mme. Leontine Pauline | Female | 24 | 0 | 0 | PC 17477 | 69.3
    0 | 2 | Bowenur, Mr. Solomon | Male | 42 | 0 | 0 | 211535 | 13
    1 | 3 | Baclini, Miss Marie Catherine | Female | 5 | 2 | 1 | 2666 | 19.2583
    0 | 3 | Youseff, Mr. Gerious | Male | 45.5 | 0 | 0 | 2628 | 7.225
    Estimating the Surviving Chance of Titanic Passengers 44
    ……
    import tflearn
    # preprocess, to_ignore, data and labels are defined in the elided part above
    # Preprocess data
    data = preprocess(data, to_ignore)
    # Build neural network: 6 input features, two hidden layers of 32 nodes, 2 output classes
    net = tflearn.input_data(shape=[None, 6])
    net = tflearn.fully_connected(net, 32)
    net = tflearn.fully_connected(net, 32)
    net = tflearn.fully_connected(net, 2, activation='softmax')
    # Regression layer (optimizer and loss); not on the slide, taken from the TFLearn quickstart
    net = tflearn.regression(net)
    # Define model
    model = tflearn.DNN(net)
    # Start training (apply gradient descent algorithm)
    model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)
  • 23. 2022/9/6 Estimating the Surviving Chance of Titanic Passengers 45
    # Let's create some data for DiCaprio and Winslet
    # Fields: pclass, name, sex, age, sibsp, parch, ticket, fare
    dicaprio = [3, 'Jack Dawson', 'male', 19, 0, 0, 'N/A', 5.0000]
    winslet = [1, 'Rose DeWitt Bukater', 'female', 17, 1, 2, 'N/A', 100.0000]
    # Preprocess data
    dicaprio, winslet = preprocess([dicaprio, winslet], to_ignore)
    # Predict surviving chances (class 1 results)
    pred = model.predict([dicaprio, winslet])
    print('DiCaprio Surviving Rate:', pred[0][1])
    print('Winslet Surviving Rate:', pred[1][1])
    Estimating the Surviving Chance of Titanic Passengers 46 • Output: • DiCaprio Surviving Rate: 0.13849584758251708 • Winslet Surviving Rate: 0.92201167345047
  • 24. 2022/9/6 Recipe for Learning [Figure: recipe flowchart with the branches "Modify the network", "Better optimization strategy" and "Preventing overfitting"; don't forget overfitting!] http://www.gizmodo.com.au/2015/04/the-basic-recipe-for-machine-learningexplained-in-a-single-powerpoint-slide/ 47 Recipe for Learning • Modify the network, for example ReLU activation function, more hidden layers, more nodes in each layer (for underfitting) • Better optimization strategy, such as Adam [Diederik P. Kingma, ICLR’15] • Ways to prevent overfitting: 1. Early stopping (earlier page) 2. Remove layers and reduce the size of the model: an over-complex model is more likely to overfit. 3. Dropout (next pages) 48
  • 25. 2022/9/6 Dropout • Each time before computing the gradients, each neuron has a p% chance of being dropped out • The structure of the network is changed (it becomes thinner), and the new network is used for training • For each mini-batch, we resample the dropout neurons. 49 Dropout [Figure: training with the thinner network] 50
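The dropout rule above amounts to drawing a fresh random mask for each mini-batch. The sketch below applies such a mask to a list of hidden activations in plain Python; the function name `dropout`, the activation values and the seed are hypothetical, and p is given as a fraction (0.5) rather than a percentage.

```python
import random

def dropout(activations, p, rng):
    """Zero each activation independently with probability p."""
    return [0.0 if rng.random() < p else a for a in activations]

rng = random.Random(0)
hidden = [0.5, 1.2, 0.3, 0.9]

# A new mask is sampled for every mini-batch, so the "thinner" network changes.
batch1_hidden = dropout(hidden, 0.5, rng)
batch2_hidden = dropout(hidden, 0.5, rng)

# Edge cases: p = 0 keeps every neuron, p = 1 drops every neuron.
kept_all = dropout(hidden, 0.0, rng)
dropped_all = dropout(hidden, 1.0, rng)
```

Dropout is applied only during training; how the full network is rescaled at test time is a separate design choice not covered by this sketch.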
  • 26. 2022/9/6 Different Network Structures • CNN, LSTM, GRU, etc. are just different ways to connect neurons. 51 General Guide (more detailed) • Loss on training data is large: either model bias (make your model complex) or an optimization issue (Adam?) • Loss on training data is small but loss on testing data is large: either overfitting (more training data, data augmentation, make your model simpler) or mismatch (training / testing data distribution mismatched) • Making the model more complex versus simpler is a trade-off: split your training data into a training set and a validation set for model selection Source: https://speech.ee.ntu.edu.tw/~hylee/ml/ml2021-course-data/overfit-v6.pptx
  • 27. 2022/9/6 Mismatch • Training / testing data distribution mismatched • Could use a Bayesian Neural Network (BNN) to detect mismatch (anomaly) and issue an alarm / re-train [Zhu17] L. Zhu and N. Laptev, “Deep and confident prediction for time series at Uber,” in Proc. IEEE International Conference on Data Mining Workshops (ICDMW), 2017, pp. 103–110. 53 Model Bias • The model is too simple: $y = b + w x_1$ • More features: $y = b + \sum_j w_j x_j$ (small loss) • Deep Learning (more neurons, layers): $y = b + \sum_i c_i \,\mathrm{sigmoid}\left(b_i + \sum_j w_{ij} x_j\right)$ (small loss)
  • 28. 2022/9/6 Optimization Issue • A large loss does not always imply model bias. There is another possibility … [Figure: loss $L$ over the parameters $\boldsymbol{\theta}$; $L(\boldsymbol{\theta}^*)$ is large] Optimization Issue • Gaining the insights from comparison • Start from shallower networks (or other models), which are easier to optimize. • If deeper networks do not obtain smaller loss on training data, then there is an optimization issue. • Solution: more powerful optimization technology (the last pages add momentum) Ref: http://arxiv.org/abs/1512.03385 • Loss for 2017 – 2020: 1 layer 0.28k, 2 layers 0.18k, 3 layers 0.14k, 4 layers 0.10k, 5 layers 0.34k
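  • 29. 2022/9/6 Overfitting • Small loss on training data, large loss on testing data. Why? An extreme example: training data $\{(\boldsymbol{x}_1, y_1), (\boldsymbol{x}_2, y_2), \ldots, (\boldsymbol{x}_N, y_N)\}$, with $f(\boldsymbol{x}) = y_i$ if $\exists\, \boldsymbol{x}_i = \boldsymbol{x}$ and $f(\boldsymbol{x})$ random otherwise. This function obtains zero training loss, but large testing loss. Less than useless … Overfitting (does not generalize well) [Figure: x-y plots of the real data distribution (not observable), the training data and the testing data; a flexible model fitted “freestyle” to the training data has a large loss on the testing data]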
  • 30. 2022/9/6 Overfitting [Figure: x-y plots of a flexible model] • More training data • Data augmentation N-fold Cross Validation to select model (N=3): efficiently gives you “more” data for validation • Training set splits: (Train, Train, Val), (Train, Val, Train), (Val, Train, Train) • Model 1: mse = 0.4, 0.5, 0.3; Avg mse = 0.4 • Model 2: mse = 0.4, 0.5, 0.6; Avg mse = 0.5 • Model 3: mse = 0.2, 0.4, 0.3; Avg mse = 0.3
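The model selection step of N-fold cross validation is a short computation: average each model's validation mse over the folds and keep the model with the lowest average. The sketch below uses the mse values from this page, read as three values per model because that reading reproduces the stated averages; this grouping is an interpretation of the flattened table, and the variable names are hypothetical.

```python
# Validation mse per fold for each candidate model (N = 3 folds).
fold_mse = {
    "Model 1": [0.4, 0.5, 0.3],
    "Model 2": [0.4, 0.5, 0.6],
    "Model 3": [0.2, 0.4, 0.3],
}

# Average the validation mse over the folds for each model.
avg_mse = {name: sum(values) / len(values) for name, values in fold_mse.items()}

# Select the model with the smallest average validation mse.
best_model = min(avg_mse, key=avg_mse.get)
```

Because every sample serves as validation data in exactly one fold, the averages use the whole training set for validation without ever validating on data the model was trained on in that fold.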
  • 31. 2022/9/6 Optimization Fails because …… [Figure: training loss versus updates; the loss is not small enough because the gradient is close to zero at a critical point, which is either a local minimum (no way to go) or a saddle point (can escape). Which one?] Small Gradient … [Figure: loss versus the value of a network parameter w under gradient descent: very slow at the plateau ($\partial L / \partial w \approx 0$), stuck at a saddle point ($\partial L / \partial w = 0$), stuck at local minima ($\partial L / \partial w = 0$)]
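  • 32. 2022/9/6 Small Batch v.s. Large Batch • Speed for one update (no parallel): small batch is faster, large batch is slower • Speed for one update (with parallel): the same for both (as long as the large batch is not too large) • Time for one epoch: small batch is slower, large batch is faster • Gradient: small batch is noisy, large batch is stable (weak law of large numbers) • Optimization: small batch is better, large batch is worse • Generalization: small batch is better, large batch is worse • Batch size is a hyperparameter you have to decide. Gradient Descent + Momentum (momentum p = mv, as in high-school physics) • Movement is not just based on the gradient, but also on the previous movement: movement = movement of the last step minus the gradient at present • Starting at $\boldsymbol{\theta}_0$, movement $\boldsymbol{m}_0 = \boldsymbol{0}$ • Compute gradient $\boldsymbol{g}_0$; movement $\boldsymbol{m}_1 = \lambda \boldsymbol{m}_0 - \eta \boldsymbol{g}_0$; move to $\boldsymbol{\theta}_1 = \boldsymbol{\theta}_0 + \boldsymbol{m}_1$ • Compute gradient $\boldsymbol{g}_1$; movement $\boldsymbol{m}_2 = \lambda \boldsymbol{m}_1 - \eta \boldsymbol{g}_1$; move to $\boldsymbol{\theta}_2 = \boldsymbol{\theta}_1 + \boldsymbol{m}_2$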
  • 33. 2022/9/6 Gradient Descent + Momentum • Real movement = negative of $\partial L / \partial w$ + last movement • [Figure: loss curve; at a point where $\partial L / \partial w = 0$ the last movement still moves the parameter] • Momentum helps to get out of a local minimum
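The momentum update can be sketched for a single parameter in plain Python, following the rule movement = λ · last movement − η · gradient. The function names `momentum_step` and `minimize`, the values η = 0.1 and λ = 0.9, and the quadratic loss are illustrative assumptions.

```python
def momentum_step(theta, movement, gradient, eta=0.1, lam=0.9):
    """One update: new movement = lam * last movement - eta * gradient."""
    movement = lam * movement - eta * gradient
    return theta + movement, movement

def minimize(grad_fn, theta0, n_steps=300, eta=0.1, lam=0.9):
    """Gradient descent with momentum, starting with zero movement."""
    theta, movement = theta0, 0.0
    for _ in range(n_steps):
        theta, movement = momentum_step(theta, movement, grad_fn(theta), eta, lam)
    return theta

# Hypothetical loss L = (theta - 2)^2, whose gradient is 2 * (theta - 2).
theta_min = minimize(lambda th: 2.0 * (th - 2.0), theta0=0.0)

# Where the gradient is exactly zero, the last movement still moves the parameter.
theta_next, m_next = momentum_step(theta=1.0, movement=0.5, gradient=0.0)
```

The second call shows why momentum can carry the parameter past a point with zero gradient: plain gradient descent would stay at 1.0, while here the parameter moves on by λ times the last movement.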