Understanding of Backpropagation
Kang, Min-Guk
[Figure: a feed-forward network with inputs X1 and X2 connected to units 1 and 2 by w_in,1 and w_in,2, hidden units 3, 4, 5 connected by w13, w14, w15, w23, w24, w25, and output unit 6 connected by w36, w46, w56. Each unit j has a pre-activation z^(j) and an activation a^(j).]
1. What is Backpropagation?
1. Definition
Backpropagation is an algorithm for supervised learning of artificial neural networks using gradient descent.
2. History
The backpropagation algorithm was developed in the 1970s, but in 1986 Rumelhart, Hinton, and Williams
showed experimentally that this method can generate useful internal representations of incoming data in
the hidden layers of neural networks.
3. How to use Backpropagation?
Backpropagation boils down to repeated application of the chain rule. However, because we usually use
non-linear activation functions, the derivatives can be hard to work with at first.
(In these slides I will use the sigmoid function as the activation function.)
2. Preparations
1. Cost function (loss function)
I will use the cost function below ($y_a$ is the predicted value of the hypothesis, $y_t$ is the true value):
$$C = \frac{1}{2}(y_a - y_t)^2$$
2. Derivative of sigmoid function
$$\frac{dS(z)}{dz} = \frac{1}{(1+e^{-z})^2} \times (-1) \times (-e^{-z}) = S(z)\,(1 - S(z))$$
$$\left(\because\ \text{sigmoid function } S(z) = \frac{1}{1+e^{-z}}, \qquad \left(\frac{1}{g(z)}\right)' = -\frac{g'(z)}{g(z)^2}\right)$$
3. How to update weights using gradient descent
$$W_{i \to j,\,new} = W_{i \to j,\,old} - \eta\,\frac{\partial C}{\partial W_{i \to j,\,old}} \qquad (\eta \text{ is the learning rate})$$
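To make these preparations concrete, here is a minimal Python sketch (my own illustration, not part of the slides; all function names are assumptions) of the cost function, the sigmoid and its derivative, and a single gradient-descent weight update.

```python
import numpy as np

def sigmoid(z):
    # S(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # dS/dz = S(z) * (1 - S(z)), as derived above
    s = sigmoid(z)
    return s * (1.0 - s)

def cost(y_a, y_t):
    # C = 1/2 * (y_a - y_t)^2
    return 0.5 * (y_a - y_t) ** 2

def update_weight(w_old, dC_dw, eta=0.1):
    # W_new = W_old - eta * dC/dW  (eta is the learning rate)
    return w_old - eta * dC_dw
```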
3. Jump into Backpropagation
1. Derivative relationships between weights
1-1. The weight update depends on derivatives from the layers to its right (the layers closer to the output).
$$C = \frac{1}{2}(y_a - y_t)^2$$
$$\Rightarrow\ \frac{\partial C}{\partial W_{2,3}} = (y_a - y_t)\,\frac{\partial y_a}{\partial W_{2,3}} = (y_a - y_t)\,\frac{\partial}{\partial W_{2,3}}\,\sigma(z^{(3)}) \qquad (\sigma \text{ is the sigmoid function})$$
$$\frac{\partial C}{\partial W_{2,3}} = (y_a - y_t)\,\sigma(z^{(3)})\,[1 - \sigma(z^{(3)})]\,\frac{\partial z^{(3)}}{\partial W_{2,3}} = (y_a - y_t)\,\sigma(z^{(3)})\,[1 - \sigma(z^{(3)})]\,\frac{\partial}{\partial W_{2,3}}\bigl(a^{(2)}\,w_{2,3}\bigr)$$
$$\therefore\ \frac{\partial C}{\partial W_{2,3}} = (y_a - y_t)\,\sigma(z^{(3)})\,[1 - \sigma(z^{(3)})]\,a^{(2)}$$
[Figure: a chain network X_in → (1) → (2) → (3) with weights w_in,1, w_1,2, w_2,3; each unit j has pre-activation z^(j) and activation a^(j), and a^(3) = y_a. Feed-forward runs left to right, backpropagation right to left.]
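The following short Python sketch (my own illustration with made-up numbers, not from the slides) runs the feed-forward pass of this chain network and evaluates $\partial C / \partial W_{2,3}$ with the formula just derived.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy values for the chain X_in -> (1) -> (2) -> (3); the numbers are illustrative
x_in, w_in1, w12, w23, y_t = 0.5, 0.4, -0.3, 0.8, 1.0

# feed-forward pass
z1 = x_in * w_in1; a1 = sigmoid(z1)
z2 = a1 * w12;     a2 = sigmoid(z2)
z3 = a2 * w23;     a3 = sigmoid(z3)
y_a = a3

# dC/dW_{2,3} = (y_a - y_t) * sigma(z3) * (1 - sigma(z3)) * a2
dC_dw23 = (y_a - y_t) * a3 * (1.0 - a3) * a2
print(dC_dw23)
```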
3. Jump into Backpropagation
1. Derivative relationships between weights
1-1. The weight update depends on derivatives from the layers to its right (the layers closer to the output).
$$C = \frac{1}{2}(y_a - y_t)^2$$
$$\Rightarrow\ \frac{\partial C}{\partial W_{1,2}} = (y_a - y_t)\,\frac{\partial y_a}{\partial W_{1,2}} = (y_a - y_t)\,\frac{\partial}{\partial W_{1,2}}\,\sigma(z^{(3)}) \qquad (\sigma \text{ is the sigmoid function})$$
$$\frac{\partial C}{\partial W_{1,2}} = (y_a - y_t)\,\sigma(z^{(3)})\,[1 - \sigma(z^{(3)})]\,\frac{\partial z^{(3)}}{\partial W_{1,2}} = (y_a - y_t)\,\sigma(z^{(3)})\,[1 - \sigma(z^{(3)})]\,\frac{\partial}{\partial W_{1,2}}\bigl(a^{(2)}\,w_{2,3}\bigr)$$
$$\frac{\partial C}{\partial W_{1,2}} = (y_a - y_t)\,\sigma(z^{(3)})\,[1 - \sigma(z^{(3)})]\,w_{2,3}\,\frac{\partial a^{(2)}}{\partial W_{1,2}} = (y_a - y_t)\,\sigma(z^{(3)})\,[1 - \sigma(z^{(3)})]\,w_{2,3}\,\frac{\partial}{\partial W_{1,2}}\,\sigma(z^{(2)})$$
$$\therefore\ \frac{\partial C}{\partial W_{1,2}} = (y_a - y_t)\,\sigma(z^{(3)})\,[1 - \sigma(z^{(3)})]\,w_{2,3}\,\sigma(z^{(2)})\,[1 - \sigma(z^{(2)})]\,a^{(1)}$$
[Figure: the same chain network X_in → (1) → (2) → (3) as on the previous slide, with a^(3) = y_a.]
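A quick way to convince yourself that this longer chain of factors is correct is to compare it with a numerical (finite-difference) derivative. The check below is my own addition, not part of the slides, and reuses the illustrative numbers from the previous sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_cost(w12, x_in=0.5, w_in1=0.4, w23=0.8, y_t=1.0):
    # feed-forward through the chain X_in -> (1) -> (2) -> (3) and return the cost
    a1 = sigmoid(x_in * w_in1)
    a2 = sigmoid(a1 * w12)
    a3 = sigmoid(a2 * w23)
    return 0.5 * (a3 - y_t) ** 2, a1, a2, a3

w12 = -0.3
C, a1, a2, a3 = forward_cost(w12)

# analytic gradient from this slide:
# (y_a - y_t) * sigma'(z3) * w23 * sigma'(z2) * a1,  with sigma'(z) = a * (1 - a)
analytic = (a3 - 1.0) * a3 * (1 - a3) * 0.8 * a2 * (1 - a2) * a1

# numerical gradient by central finite differences
eps = 1e-6
numeric = (forward_cost(w12 + eps)[0] - forward_cost(w12 - eps)[0]) / (2 * eps)
print(analytic, numeric)  # the two values should agree to several decimal places
```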
3. Jump into Backpropagation
1. Derivative relationships between weights
1-1. The weight update depends on derivatives from the layers to its right (the layers closer to the output).
$$C = \frac{1}{2}(y_a - y_t)^2$$
$$\Rightarrow\ \frac{\partial C}{\partial W_{in,1}} = (y_a - y_t)\,\frac{\partial y_a}{\partial W_{in,1}} = (y_a - y_t)\,\frac{\partial}{\partial W_{in,1}}\,\sigma(z^{(3)}) \qquad (\sigma \text{ is the sigmoid function})$$
In the same way, we get the equation below.
$$\frac{\partial C}{\partial W_{in,1}} = (y_a - y_t)\,\sigma(z^{(3)})\,[1 - \sigma(z^{(3)})]\,w_{2,3}\,\sigma(z^{(2)})\,[1 - \sigma(z^{(2)})]\,w_{1,2}\,\sigma(z^{(1)})\,[1 - \sigma(z^{(1)})]\,X_{in}$$
[Figure: the same chain network X_in → (1) → (2) → (3) as on the previous slides, with a^(3) = y_a.]
3. Jump into Backpropagation
1. Derivative relationships between weights
1-2. The weight update depends on derivatives from both paths.
Getting this result takes a more tedious calculation than the previous one, so I only state the result here.
If you want to see the calculation step by step, look at the next slide!
$$\frac{\partial C}{\partial W_{in,1}} = (y_a - y_t)\,X_{in}\Bigl[\,\sigma(z^{(2)})\,[1 - \sigma(z^{(2)})]\,w_{2,4}\;\sigma(z^{(1)})\,[1 - \sigma(z^{(1)})]\,w_{1,2} \;+\; \sigma(z^{(3)})\,[1 - \sigma(z^{(3)})]\,w_{3,4}\;\sigma(z^{(1)})\,[1 - \sigma(z^{(1)})]\,w_{1,3}\,\Bigr]$$
(The first bracketed term comes from the path through unit 2, marked ①② in the figure, and the second from the path through unit 3, marked ③④. Note that in this example the output unit 4 is treated as linear, $y_a = z^{(4)} = a^{(2)} w_{2,4} + a^{(3)} w_{3,4}$, so no $\sigma'$ factor appears for it.)
[Figure: a diamond-shaped network X_in → (1), then (1) → (2) via w_1,2 and (1) → (3) via w_1,3, then (2) → (4) via w_2,4 and (3) → (4) via w_3,4; unit 4 is the output producing y_a. The upper path is marked ①②, the lower path ③④.]
[Figure: the same diamond-shaped network as on the previous slide.]
$$\frac{\partial C}{\partial W_{in,1}} = \frac{\partial}{\partial W_{in,1}}\,\frac{1}{2}(y_a - y_t)^2 = (y_a - y_t)\,\frac{\partial}{\partial W_{in,1}}\bigl(\sigma(z^{(2)})\,w_{2,4} + \sigma(z^{(3)})\,w_{3,4}\bigr)$$
$$\frac{\partial C}{\partial W_{in,1}} = (y_a - y_t)\Bigl[\,w_{2,4}\,\frac{\partial}{\partial W_{in,1}}\sigma(z^{(2)}) + w_{3,4}\,\frac{\partial}{\partial W_{in,1}}\sigma(z^{(3)})\,\Bigr] = (y_a - y_t)\Bigl[\,w_{2,4}\,\sigma'(z^{(2)})\,\frac{\partial}{\partial W_{in,1}}\bigl(\sigma(z^{(1)})\,w_{1,2}\bigr) + w_{3,4}\,\sigma'(z^{(3)})\,\frac{\partial}{\partial W_{in,1}}\bigl(\sigma(z^{(1)})\,w_{1,3}\bigr)\,\Bigr]$$
$$\frac{\partial C}{\partial W_{in,1}} = (y_a - y_t)\Bigl[\,w_{2,4}\,\sigma'(z^{(2)})\,w_{1,2}\,\sigma'(z^{(1)})\,\frac{\partial}{\partial W_{in,1}}\bigl(X_{in}\,w_{in,1}\bigr) + w_{3,4}\,\sigma'(z^{(3)})\,w_{1,3}\,\sigma'(z^{(1)})\,\frac{\partial}{\partial W_{in,1}}\bigl(X_{in}\,w_{in,1}\bigr)\,\Bigr]$$
$$\frac{\partial C}{\partial W_{in,1}} = (y_a - y_t)\Bigl[\,w_{2,4}\,\sigma'(z^{(2)})\,w_{1,2}\,\sigma'(z^{(1)})\,X_{in} + w_{3,4}\,\sigma'(z^{(3)})\,w_{1,3}\,\sigma'(z^{(1)})\,X_{in}\,\Bigr]$$
$$= (y_a - y_t)\,X_{in}\Bigl[\,\sigma(z^{(2)})\,[1 - \sigma(z^{(2)})]\,w_{2,4}\;\sigma(z^{(1)})\,[1 - \sigma(z^{(1)})]\,w_{1,2} + \sigma(z^{(3)})\,[1 - \sigma(z^{(3)})]\,w_{3,4}\;\sigma(z^{(1)})\,[1 - \sigma(z^{(1)})]\,w_{1,3}\,\Bigr]$$
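The sketch below (my own illustration with made-up numbers) evaluates this two-path gradient for the diamond network, treating the output unit 4 as linear, as in the derivation above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sprime_from_a(a):
    # sigma'(z) written in terms of the activation a = sigma(z)
    return a * (1.0 - a)

# toy values for the diamond network X_in -> (1) -> {(2), (3)} -> (4); numbers are illustrative
x_in, w_in1, w12, w13, w24, w34, y_t = 0.5, 0.4, -0.3, 0.7, 0.8, -0.6, 1.0

# feed-forward pass (output unit 4 is linear here, as in the slide's derivation)
a1 = sigmoid(x_in * w_in1)
a2 = sigmoid(a1 * w12)
a3 = sigmoid(a1 * w13)
y_a = a2 * w24 + a3 * w34

# dC/dW_in,1 = (y_a - y_t) * X_in * [ s'(z2) w24 s'(z1) w12 + s'(z3) w34 s'(z1) w13 ]
dC_dwin1 = (y_a - y_t) * x_in * (
    sprime_from_a(a2) * w24 * sprime_from_a(a1) * w12 +
    sprime_from_a(a3) * w34 * sprime_from_a(a1) * w13
)
print(dC_dwin1)
```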
3. Jump into Backpropagation
1. Derivative relationships between weights
1-3. The derivative for a weight does not depend on the derivatives of any of the other weights in the same layer.
This one is easy, so I will not explain it here (homework!).
[Figure: the network from slide 1 (inputs X1, X2; units 1–6; weights w13, w14, w15, w23, w24, w25, w36, w46, w56), with the first-layer weights w^(1) and second-layer weights w^(2) highlighted as independent of each other.]
3. Jump into Backpropagation
2. Application of gradient descent
$$W_{i \to j,\,new} = W_{i \to j,\,old} - \eta\,\frac{\partial C}{\partial W_{i \to j,\,old}} \qquad (\eta \text{ is the learning rate})$$
In this equation, ① is the old weight $W_{i \to j,\,old}$, ② is the learning rate $\eta$, and ③ is the gradient $\partial C / \partial W_{i \to j,\,old}$:
① At first, we initialize the weights and biases with an initializer → we know this value!
② We can control the learning rate → we know this value!
③ We can obtain this value from the equations derived above → we know this value!
So we can update the weights with the equation above. But isn't it cumbersome to apply in this form?
To make the application simpler, we will define the Error Signal.
3. Jump into Backpropagation
3. Error Signals
1-1 Definition: $\delta_j = \dfrac{\partial C}{\partial z_j}$
1-2 General form of the error signals
$$\delta_j = \frac{\partial C}{\partial z_j} = \frac{\partial}{\partial z_j}\,\frac{1}{2}(y_a - y_t)^2 = (y_a - y_t)\,\frac{\partial y_a}{\partial z_j} \qquad \text{------- ①}$$
$$\frac{\partial y_a}{\partial z_j} = \frac{\partial y_a}{\partial a_j}\,\frac{\partial a_j}{\partial z_j} = \frac{\partial y_a}{\partial a_j}\times\sigma'(z_j) \qquad (\because\ a_j = \sigma(z_j))$$
Because the neural network consists of multiple units, unit $j$ feeds all of the units $k \in outs(j)$. So,
$$\frac{\partial y_a}{\partial z_j} = \sigma'(z_j)\sum_{k \in outs(j)} \frac{\partial y_a}{\partial z_k}\,\frac{\partial z_k}{\partial a_j}$$
$$\frac{\partial y_a}{\partial z_j} = \sigma'(z_j)\sum_{k \in outs(j)} \frac{\partial y_a}{\partial z_k}\,w_{jk} \qquad (\because\ z_k = w_{jk}\,a_j)$$
By equation ① above and $\delta_k = (y_a - y_t)\,\dfrac{\partial y_a}{\partial z_k}$,
$$\delta_j = (y_a - y_t)\,\sigma'(z_j)\sum_{k \in outs(j)} \frac{\partial y_a}{\partial z_k}\,w_{jk} = (y_a - y_t)\,\sigma'(z_j)\sum_{k \in outs(j)} \frac{\delta_k}{(y_a - y_t)}\,w_{jk}$$
$$\therefore\ \delta_j = \sigma'(z_j)\sum_{k \in outs(j)} \delta_k\,w_{jk}\,, \quad \text{and to start the recursion we define } \delta_{initial} = (y_a - y_t)\,\sigma(z^{(initial)})\,[1 - \sigma(z^{(initial)})].$$
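To see the recursion in action, here is a Python sketch (my own illustration, with the same made-up numbers as before) that computes the error signals of the three-unit chain network backwards and reproduces the gradients derived on slides 4–6.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sprime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

# chain network X_in -> (1) -> (2) -> (3), illustrative values
x_in, w_in1, w12, w23, y_t = 0.5, 0.4, -0.3, 0.8, 1.0
z1 = x_in * w_in1; a1 = sigmoid(z1)
z2 = a1 * w12;     a2 = sigmoid(z2)
z3 = a2 * w23;     a3 = sigmoid(z3)
y_a = a3

# starting signal at the output unit: delta_3 = (y_a - y_t) * sigma'(z3)
d3 = (y_a - y_t) * sprime(z3)
# recursion delta_j = sigma'(z_j) * sum_k delta_k * w_jk (each unit has one outgoing weight here)
d2 = sprime(z2) * d3 * w23
d1 = sprime(z1) * d2 * w12

print(d3 * a2)    # equals dC/dW_{2,3} from slide 4
print(d2 * a1)    # equals dC/dW_{1,2} from slide 5
print(d1 * x_in)  # equals dC/dW_{in,1} from slide 6
```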
3. Jump into Backpropagation
3. Error Signals
1-3 The general form of the weight update
$$\Bigl(\ ※\ W_{3 \to 6,\,new} = W_{3 \to 6,\,old} - \eta\,\frac{\partial C}{\partial W_{3 \to 6,\,old}}\ \Bigr) \qquad \Bigl(\ ※\ \delta_6 = \delta_{initial} = (y_a - y_t)\,\sigma(z^{(6)})\,[1 - \sigma(z^{(6)})]\ \Bigr)$$
$$\Delta W_{3,6} = -\eta\,(y_a - y_t)\,\sigma(z^{(6)})\,[1 - \sigma(z^{(6)})]\,a^{(3)} = -\eta\,\delta_6\,a^{(3)}$$
$$\Delta W_{4,6} = -\eta\,(y_a - y_t)\,\sigma(z^{(6)})\,[1 - \sigma(z^{(6)})]\,a^{(4)} = -\eta\,\delta_6\,a^{(4)}$$
$$\Delta W_{5,6} = -\eta\,(y_a - y_t)\,\sigma(z^{(6)})\,[1 - \sigma(z^{(6)})]\,a^{(5)} = -\eta\,\delta_6\,a^{(5)}$$
$$\Delta W_{1,3} = -\eta\,(y_a - y_t)\,\sigma(z^{(6)})\,[1 - \sigma(z^{(6)})]\,w_{3,6}\,\sigma(z^{(3)})\,[1 - \sigma(z^{(3)})]\,a^{(1)} = -\eta\,\Bigl(\sigma'(z^{(3)})\sum_{k \in outs(3)} \delta_k\,w_{3,k}\Bigr)\,a^{(1)} = -\eta\,\delta_3\,a^{(1)}$$
$$\cdots$$
$$\therefore\ \Delta W_{i,j} = -\eta\,\delta_j\,a^{(i)}$$
→ We can easily update the weights by using the error signals $\delta$ and the equation $\Delta W_{i,j} = -\eta\,\delta_j\,a^{(i)}$.
[Figure: the network from slide 1 again (inputs X1, X2; units 1–6 with z^(j) | a^(j); weights w_in,1, w_in,2, w13, w14, w15, w23, w24, w25, w36, w46, w56).]
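Putting everything together, here is a sketch (my own illustration; the shapes follow the figure's two-input, three-hidden-unit, one-output network, and all numbers are made up) of one full backpropagation update using $\Delta W_{i,j} = -\eta\,\delta_j\,a^{(i)}$.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sprime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

rng = np.random.default_rng(0)
eta = 0.5                       # learning rate
x1, x2, y_t = 0.3, 0.9, 1.0     # illustrative input and target

# weights named after the figure: w_in (X1 -> unit 1, X2 -> unit 2),
# W1 (units 1,2 -> units 3,4,5), W2 (units 3,4,5 -> unit 6)
w_in = rng.normal(size=2)
W1 = rng.normal(size=(2, 3))
W2 = rng.normal(size=3)

# feed-forward pass
z12 = np.array([x1, x2]) * w_in;  a12 = sigmoid(z12)    # units 1, 2
z345 = a12 @ W1;                  a345 = sigmoid(z345)  # units 3, 4, 5
z6 = a345 @ W2;                   y_a = sigmoid(z6)     # unit 6 (output)

# error signals, computed backwards
d6 = (y_a - y_t) * sprime(z6)     # delta_initial at the output unit
d345 = sprime(z345) * d6 * W2     # delta_j = sigma'(z_j) * sum_k delta_k * w_jk
d12 = sprime(z12) * (W1 @ d345)

# weight updates: Delta W_{i,j} = -eta * delta_j * a^(i)
W2 += -eta * d6 * a345
W1 += -eta * np.outer(a12, d345)
w_in += -eta * d12 * np.array([x1, x2])
```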
4. Summary
Although the picture below is a bit different from my description, working through the calculations will show that it is exactly the same as my explanation.
(Picture source: http://home.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html)